ABSTRACT

In this paper, we develop a method for computing the one-point statistics
of a perturbed density field with a multiresolutional decomposition based on
the discrete wavelet transform (DWT). We establish the algorithm for the
one-point variable and its moments, taking into account the effects of Poisson
sampling and the selection function. We also establish the mapping between the
DWT one-point statistics in redshift space and real space, i.e. the algorithm
for recovering the real-space DWT one-point statistics from the redshift
distortion caused by bulk velocity, velocity dispersion, and the selection
function. Numerical tests on N-body simulation samples show that this algorithm
works well on scales from a few hundred to a few $h^{-1}$ Mpc for four popular
cold dark matter models.

Because the DWT one-point variable depends on both the scale and the shape
(configuration) of the decomposition modes, one can design estimators of the
redshift distortion parameter ($\beta$) from combinations of DWT modes.
Compared with conventional estimators, such as the quadrupole-to-monopole
ratio, the DWT $\beta$ estimators are scale-decomposed, which is useful when
considering scale-dependent effects. When the non-linear redshift distortion is
not negligible, the quadrupole-to-monopole ratio is a function of scale, and
this estimator does not work without additional information about the scale
dependence, such as the power-spectrum index or the real-space correlation
function of the random field. The DWT $\beta$ estimators, however, do not need
such extra information. Moreover, the scale-decomposed $\beta$ estimators would
also be able to reveal the scale dependence of the bias parameter of galaxies.
Numerical tests show that the proposed DWT estimators are able to determine
$\beta$ robustly with less than 15% uncertainty in the redshift range
$0 < z < 3$.

Subject headings: cosmology: theory - large-scale structure of the universe 
1. Introduction

The one-point probability distribution function (PDF) of a random mass density
field $\rho(\mathbf{x})$, or the counts-in-cells (CiC) statistics of a discrete
random distribution such as galaxies, is probably the first statistic used to
reveal the clustering of the galaxy distribution.^1 In his famous work, Edwin
Hubble (1934) showed that the frequency distribution of the galaxy count N in
angular cells is not Gaussian, but lognormal. This result indicates that the
PDF of the galaxy distribution is fundamental in characterizing the cosmic mass
and velocity fields.

Although current samples used for large-scale structure studies are much deeper
in redshift and much wider in angular size than those available in Hubble's
time, one-point statistics are still frequently applied. This is because the
one-point distribution and its moments contain complete information about the
field, which might not be easily detected by other conventional methods. Unlike
the Fourier amplitude, CiC is not subject to the central limit theorem. It can
detect the non-Gaussianity of a field consisting of randomly distributed
non-Gaussian clumps, while the PDF of the Fourier amplitudes is still Gaussian
due to the central limit theorem (Fan and Bardeen 1995). Even the 2nd moment of
the one-point distribution is different from the 2nd moment of the Fourier
decomposition, i.e. the power spectrum. The former contains perturbations on
scales larger than the size of the observed sample, while the latter does not.
Therefore, one-point statistics are applied to various samples of large-scale
structure, including galaxy surveys (e.g. Hamilton 1985; Alimi et al. 1990;
Gaztañaga 1992; Szapudi et al. 1996; Kim & Strauss 1998), the transmitted flux
of quasars' Ly$\alpha$ absorption spectra (e.g. Meiksin & Bouchet 1995;
Gaztañaga & Croft 1999; Zhan & Fang 2002), and N-body simulation samples (e.g.
Coles & Jones 1991; Taylor & Watts 2000).

The one-point statistics of a density fluctuation field $\delta(\mathbf{x})$
are given by the distribution of the one-point variable
$\delta_R(\mathbf{x}_0)$, which is a sampling of the field by a window function
$W_R(\mathbf{x}-\mathbf{x}_0)$ around position $\mathbf{x}_0$ and on scale $R$,
i.e.

$$\delta_R(\mathbf{x}_0) = \int W_R(\mathbf{x}-\mathbf{x}_0)\,\delta(\mathbf{x})\,d\mathbf{x}. \qquad (1)$$

The variable $\delta_R(\mathbf{x}_0)$ is actually a mean value of the field at
position $\mathbf{x}_0$ and on scale $R$. The distribution of
$\delta_R(\mathbf{x}_0)$ gives a PDF description of the field.

Eq.(1) is a space($\mathbf{x}_0$)-scale($R$) decomposition of the field
$\delta(\mathbf{x})$ with bases $W_R(\mathbf{x}-\mathbf{x}_0)$. The most popular
windows $W_R(\mathbf{x}-\mathbf{x}_0)$ are Gaussian and top-hat filters. The
Gaussian windows are generally not orthogonal, i.e. the function
$W_R(\mathbf{x}-\mathbf{x}_0)$ does not satisfy
$\int W_R(\mathbf{x}-\mathbf{x}_1)W_R(\mathbf{x}-\mathbf{x}_2)\,d\mathbf{x} = 0$
if $\mathbf{x}_1 \neq \mathbf{x}_2$. Thus, the variables
$\delta_R(\mathbf{x}_0)$ are either incomplete or redundant. In turn, this may
lead to 1) loss of information about the field $\delta(\mathbf{x})$; and 2)
false correlations. It is possible to construct an orthogonal and complete set
of bases from top-hat windows. However, they are not localized in Fourier
space, so the index $R$ does not refer to a well defined scale $k$. As a
consequence, one-point statistics with conventional windows are not suitable
for problems with scale dependence. For instance, the redshift distortion of
the one-point statistics can be properly estimated only if we know how the
redshift distortion depends on the scale and shape of the window function.
Therefore, a better algorithm for the one-point statistics is needed, for
example, to recover the real-space rms density fluctuation $\sigma_8$ from
redshift distortion.

^1 Since we study only the one-point PDF in this paper, not the N-point PDF,
hereafter PDF stands only for the one-point statistics.

In this paper, we show that these problems can be solved with a
multiresolutional decomposition via the discrete wavelet transform (DWT). The
bases of the DWT decomposition are complete, orthogonal, and localized in both
physical and Fourier spaces. It has been shown in the last few years that the
DWT can be employed as an alternative representation in most conventional
statistics of cosmic mass and velocity fields, including the power spectrum
(Pando & Fang 1998; Fang & Feng 2000; Yang et al. 2001a, 2002), high order
correlations (Pando & Fang 1998; Pando, Feng & Fang 2001; Feng, Pando & Fang
2001), bulk and pair-wise velocity (Zhan & Fang 2002; Yang et al. 2001b), and
identification of halos and clusters (Xu, Fang & Wu 2000). The DWT
representation is also able to reveal statistical features which might not be
easily detected without a set of decomposition bases localized in both physical
and scale spaces. With the DWT analysis, the intermittency of the fluctuations
of quasars' Ly$\alpha$ transmitted flux has been detected (Jamkhedkar, Zhan &
Fang 2000; Pando, Feng & Fang 2000; Zhan, Jamkhedkar & Fang 2001; Pando et al.
2002). This property cannot be simply detected by the popular
$W_R(\mathbf{x}-\mathbf{x}_0)$ or Fourier decompositions.

We also show that the DWT decomposition is very useful for one-point
statistics. We establish the algorithm for the one-point variable and its
moments, taking into account various corrections, such as Poisson sampling and
the selection function. To demonstrate the advantage of the DWT one-point
statistics, we show that one can map between the one-point statistics in real
and redshift spaces scale-by-scale, taking into account the distortion due to
bulk velocity, velocity dispersion and the selection function. With these
results one can construct $\beta$ (redshift distortion parameter) estimators
from moments of the DWT one-point variables. Unlike conventional estimators,
the DWT estimators do not need extra information or ad hoc assumptions when the
non-linear redshift distortion is not negligible.

This paper is organized as follows. §2 presents the algorithm for the PDF of
one-point variables with the DWT space-scale decomposition. §3 discusses the
second moment of the one-point variables, taking into account the effects of
Poisson sampling and the selection function. In §4 we develop the theory of
redshift distortion of the DWT one-point variables. §5 tests the theoretical
results of §4 on N-body simulation samples. The emphasis is on the
redshift-to-real-space mapping of the diagonal and off-diagonal second moments.
§6 shows the application to estimating the redshift distortion parameter.
Finally, the conclusions and discussions are given in §7. The Appendix provides
relevant formulae for quantities defined in §4. We release the codes for
calculating these quantities via anonymous ftp at
samuri.la.asu.edu/pub/zhan/DWTCiC.tgz.


2. One-point statistics in the DWT representation 
2.1. DWT variable of one-point statistics 

As emphasized in §1, a key problem for one-point statistics is to have proper
window functions $W_R(\mathbf{x})$, and to define the variable with eq.(1). In
the DWT analysis, this is done by the so-called scaling function. Let us first
briefly introduce the DWT decomposition of a random field. For the details of
the mathematical properties of the DWT see Mallat (1989a,b); Meyer (1992);
Daubechies (1992), and for physical applications see Fang & Thews (1998).

To simplify the notation, we first consider a one-dimensional (1-D) density
fluctuation $\delta(x)$ on a spatial range from $x = 0$ to $L$. We first divide
the space $L$ into $2^j$ segments labeled by $l = 0, 1, ..., 2^j - 1$. Each
segment is of size $L/2^j$. The index $j$ can be a positive integer. It stands
for a length scale $L/2^j$: the higher the $j$, the smaller the length scale.
Even though we often refer some properties to a (range of) $j$ in the analyses
below, it must be read as an association of the properties with the length
scale $L/2^j$. The index $l$ is for position, and it corresponds to the spatial
range $lL/2^j < x < (l+1)L/2^j$. That is, the space $L$ is decomposed into
cells $(j, l)$.
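
The cell bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function names `cell_index` and `cell_range` are hypothetical and are not taken from the released code.

```python
def cell_index(x, L, j):
    """Index l of the scale-j cell that contains position x, for 0 <= x < L."""
    if not 0.0 <= x < L:
        raise ValueError("x must lie in [0, L)")
    n_cells = 2 ** j
    # min() guards against floating-point round-up at the upper edge.
    return min(int(x * n_cells / L), n_cells - 1)


def cell_range(l, L, j):
    """Spatial range (lower, upper) covered by cell (j, l)."""
    width = L / 2 ** j
    return l * width, (l + 1) * width
```

For example, with L = 800 (the box size in $h^{-1}$ Mpc used in §5) and j = 2 there are four cells of width 200, and a point at x = 250 falls in cell l = 1.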

In the DWT analysis, each cell $(j, l)$ supports two compact functions: the
scaling function $\phi_{j,l}(x)$ and the wavelet $\psi_{j,l}(x)$. One example
of such functions is the Daubechies 4 (D4) wavelets,^2 which have to be
constructed recursively (Daubechies 1992; see also Fang & Thews 1998). The
properties of D4 wavelets are listed below and in the appendix §A.1. We show,
in Fig. 1, examples of the D4 scaling function and wavelet, and their Fourier
transforms $\hat\phi_{j,l}(k)$ and $\hat\psi_{j,l}(k)$. One can see from Fig. 1
that both the scaling function and the wavelet are localized in Fourier space
as well as in physical space. Generally, $\phi_{j,l}(x)$ and $\psi_{j,l}(x)$
are localized in cell $(j, l)$, $\hat\phi_{j,l}(k)$ is localized in the scale
range $|k| < 2\pi 2^j/L$, and $\hat\psi_{j,l}(k)$ in the range $k \pm k/2$,
where $|k| = 2\pi 2^j/L$.

^2 Unless mentioned in contrast to the scaling functions, wavelets, as analysis
tools, always involve the scaling functions.

Obviously, the scaling function $\phi_{j,l}(x)$ is a window function on scale
$j$ and around the segment $l$. It can be used to measure the mean field in
cell $(j, l)$

$$\delta_{j,l} = \frac{\int_0^L \delta(x)\phi_{j,l}(x)\,dx}{\int_0^L \phi_{j,l}(x)\,dx} = \frac{1}{\int_0^L \phi_{j,l}(x)\,dx}\,\epsilon_{j,l}, \qquad (2)$$

where $\epsilon_{j,l}$ is called the scaling function coefficient (SFC), given
by

$$\epsilon_{j,l} = \int_0^L \delta(x)\phi_{j,l}(x)\,dx. \qquad (3)$$

In analogy to eq.(1), eqs.(2) and (3) show that the SFC $\epsilon_{j,l}$ or
$\delta_{j,l}$ can be employed as the variable for one-point statistics.
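
Eqs.(2) and (3) can be illustrated numerically. The sketch below is not the paper's D4 implementation: it assumes the Haar scaling function (a top-hat of height $\sqrt{2^j/L}$ on the cell), chosen only because it has a closed form, and it assumes the field is sampled on a regular grid. The names `haar_sfc` and `cell_means` are hypothetical.

```python
import math


def haar_sfc(delta, L, j):
    """SFCs of a gridded 1-D field, using the Haar scaling function (assumption).

    delta: field values on a regular grid covering [0, L); the number of
    grid points must be a multiple of 2**j.
    """
    n_cells = 2 ** j
    if len(delta) % n_cells:
        raise ValueError("grid size must be a multiple of 2**j")
    per_cell = len(delta) // n_cells
    dx = L / len(delta)
    amp = math.sqrt(n_cells / L)  # height of the orthonormal Haar top-hat
    return [amp * sum(delta[l * per_cell:(l + 1) * per_cell]) * dx
            for l in range(n_cells)]


def cell_means(sfc, L, j):
    """Mean field in each cell: the SFC divided by the integral of phi_{j,l}."""
    norm = math.sqrt(L / 2 ** j)
    return [e / norm for e in sfc]
```

With `delta = [1, 1, 3, 3]`, `L = 4` and `j = 1`, the two cell means come out as 1 and 3, i.e. the plain averages of the field over each half of the box.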

Fig. 1 also demonstrates that wavelets are admissible, i.e.
$\int \psi_{j,l}(x)\,dx = 0$, and therefore, they are used to measure the
fluctuations of a field with respect to its mean in cells $(j, l)$

$$\tilde\epsilon_{j,l} = \int \delta(x)\psi_{j,l}(x)\,dx, \qquad (4)$$

where $\tilde\epsilon_{j,l}$ is called the wavelet function coefficient (WFC).

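The scaling function $\phi_{j,l}(x)$ and the wavelet $\psi_{j,l}(x)$ satisfy a
set of orthonormal relations:

$$\int \phi_{j,l}(x)\phi_{j,l'}(x)\,dx = \delta_{l,l'}, \qquad (5)$$

$$\int \psi_{j,l}(x)\psi_{j',l'}(x)\,dx = \delta_{j,j'}\delta_{l,l'}, \qquad (6)$$

$$\int \phi_{j,l}(x)\psi_{j',l'}(x)\,dx = 0, \quad \text{if } j' \ge j, \qquad (7)$$

where $\delta_{l,l'}$ is the Kronecker delta function. To be consistent with
eq.(5), the scaling function $\phi_{j,l}(x)$ is normalized to

$$\int \phi_{j,l}(x)\,dx = \sqrt{\frac{L}{2^j}}. \qquad (8)$$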
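
The relations (5)-(8) can be checked numerically. The following sketch again substitutes Haar functions for the D4 wavelets used in the paper (an assumption made only for brevity); the midpoint rule on a dyadic grid is exact for these piecewise-constant functions. All names are illustrative.

```python
import math


def haar_phi(j, l, x, L=1.0):
    """Orthonormal Haar scaling function of cell (j, l)."""
    lo, hi = l * L / 2 ** j, (l + 1) * L / 2 ** j
    return math.sqrt(2 ** j / L) if lo <= x < hi else 0.0


def haar_psi(j, l, x, L=1.0):
    """Orthonormal Haar wavelet of cell (j, l)."""
    lo, hi = l * L / 2 ** j, (l + 1) * L / 2 ** j
    mid = 0.5 * (lo + hi)
    amp = math.sqrt(2 ** j / L)
    if lo <= x < mid:
        return amp
    if mid <= x < hi:
        return -amp
    return 0.0


def inner(f, g, L=1.0, n=1024):
    """Midpoint-rule estimate of the integral of f(x) g(x) over [0, L)."""
    dx = L / n
    return sum(f((i + 0.5) * dx) * g((i + 0.5) * dx) for i in range(n)) * dx
```

For instance, `inner(lambda x: haar_phi(2, 1, x), lambda x: 1.0)` returns 0.5, which is $\sqrt{L/2^j}$ for L = 1 and j = 2, in agreement with the normalization quoted in the text.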


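With these properties, a 1-D density field $\delta(x)$ can be decomposed into

$$\delta(x) = \sum_{l=0}^{2^j-1}\epsilon_{j,l}\phi_{j,l}(x) + \sum_{j'=j}^{\infty}\sum_{l=0}^{2^{j'}-1}\tilde\epsilon_{j',l}\psi_{j',l}(x). \qquad (9)$$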
The second term on the r.h.s. of eq.(9) contains only the information of the
field $\delta(x)$ on scales equal to and less than $L/2^j$. Therefore, the
$2^j$ variables (SFCs) $\epsilon_{j,l}$, with $l = 0 ... 2^j - 1$, completely
describe the behavior of the field on scales larger than $L/2^j$. For instance,
if the resolution of the sample is $L/2^j$, $\delta(x)$ can be described by the
$2^j$ SFCs $\epsilon_{j,l}$. Because of the orthogonal relations eqs.(5) and
(7), the $2^j$ variables (SFCs) are independent and non-redundant.

In a functional space consisting of functions which are smoothed on scales
equal to and less than $L/2^j$, the completeness of the scaling function
$\phi_{j,l}(x)$ can be expressed as

$$\sum_{l=0}^{2^j-1}\phi_{j,l}(x)\phi_{j,l}(x') = \delta^D(x - x'), \qquad (10)$$

where $\delta^D(x - x')$ is the Dirac delta function.

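All the above-mentioned results can be easily generalized to three-dimensional
(3-D) fields. Let us consider a 3-D distribution $\delta(\mathbf{x})$ in a
volume $\mathbf{x} = (0, 0, 0)$ to $(L_1, L_2, L_3)$. Similar to the 1-D case,
we divide the volume $L_1 \times L_2 \times L_3$ into cells
$(\mathbf{j}, \mathbf{l})$, where $\mathbf{j} = (j_1, j_2, j_3)$ refers to the
length scale $(L_1/2^{j_1}, L_2/2^{j_2}, L_3/2^{j_3})$, and
$\mathbf{l} = (l_1, l_2, l_3)$ refers to the spatial range of the cell,
$l_iL_i/2^{j_i} < x_i < (l_i+1)L_i/2^{j_i}$ with $l_i = 0 ... 2^{j_i} - 1$,
where $i = 1, 2, 3$. The one-point statistical variable of the field
$\delta(\mathbf{x})$ is

$$\epsilon_{\mathbf{j},\mathbf{l}} = \int \delta(\mathbf{x})\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x}, \qquad (11)$$

where the 3-D scaling function (3-D window)
$\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})$ is given by a direct product of 1-D
scaling functions as

$$\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x}) = \phi_{j_1,l_1}(x_1)\,\phi_{j_2,l_2}(x_2)\,\phi_{j_3,l_3}(x_3). \qquad (12)$$

One can also construct 3-D wavelets $\psi_{\mathbf{j},\mathbf{l}}(\mathbf{x})$
as a direct product of 1-D scaling functions and wavelets (see Yang et al.
2002), and generalize eq.(9) into 3-D as

$$\delta(\mathbf{x}) = \sum_{l_1=0}^{2^{j_1}-1}\sum_{l_2=0}^{2^{j_2}-1}\sum_{l_3=0}^{2^{j_3}-1}\epsilon_{\mathbf{j},\mathbf{l}}\,\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x}) + \text{terms of } \psi_{\mathbf{j}',\mathbf{l}'} \text{ with } j'_i \ge j_i. \qquad (13)$$

Since the second term on the r.h.s. is not needed in calculating the one-point
statistics, we do not show the details of $\psi_{\mathbf{j}',\mathbf{l}'}$, but
only give the orthogonal relations

$$\int \phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\psi_{\mathbf{j}',\mathbf{l}'}(\mathbf{x})\,d\mathbf{x} = 0, \quad \text{if } j'_i \ge j_i,\ i = 1, 2, 3. \qquad (14)$$

If we consider only functions smoothed on scales equal to and less than the
scale of $\mathbf{j}$ in each dimension, the set of bases
$\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})$ is orthogonal and complete.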
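2.2. Moments of DWT one-point distribution

In one-point statistics, the statistical feature of the field
$\delta(\mathbf{x})$ is characterized by the moments of the distribution of
$\epsilon_{\mathbf{j},\mathbf{l}}$. The $m$th moment is
$\langle\epsilon^m_{\mathbf{j},\mathbf{l}}\rangle$, where $\langle...\rangle$
stands for an ensemble average. Using eq.(11), we have

$$\langle\epsilon^m_{\mathbf{j},\mathbf{l}}\rangle = \int \langle\delta(\mathbf{x}_1)...\delta(\mathbf{x}_m)\rangle\,\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x}_1)...\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x}_m)\,d\mathbf{x}_1...d\mathbf{x}_m. \qquad (15)$$

For a homogeneous field, these moments are $\mathbf{l}$-independent. Moreover,
when the "fair sample hypothesis" (Peebles 1980) holds, or equivalently, when
the random field is ergodic, the $2^{j_1+j_2+j_3}$ SFCs
$\epsilon_{\mathbf{j},\mathbf{l}}$, $l_i = 0 ... 2^{j_i} - 1$, $i = 1, 2, 3$,
for a given scale $\mathbf{j}$ can be considered as $2^{j_1+j_2+j_3}$
independent measurements, because they are measured by projecting onto the
mutually orthogonal basis $\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})$.
Accordingly, the $2^{j_1+j_2+j_3}$ SFCs form a statistical ensemble on the
scale $\mathbf{j}$. This ensemble actually represents the one-point
distribution of $\epsilon_{\mathbf{j},\mathbf{l}}$ over the DWT modes at a
given scale $\mathbf{j}$. Thus, the average over $\mathbf{l}$ is a fair
estimation of the ensemble average, i.e.

$$\langle\epsilon^m_{\mathbf{j},\mathbf{l}}\rangle \simeq \frac{1}{2^{j_1+j_2+j_3}}\sum_{l_1=0}^{2^{j_1}-1}\sum_{l_2=0}^{2^{j_2}-1}\sum_{l_3=0}^{2^{j_3}-1}\epsilon^m_{\mathbf{j},\mathbf{l}}. \qquad (16)$$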

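We now consider the 2nd moment
$\langle\epsilon^2_{\mathbf{j},\mathbf{l}}\rangle$. From eq.(15), the 2nd
moment is determined by the usual two-point correlation function
$\langle\delta(\mathbf{x}_1)\delta(\mathbf{x}_2)\rangle$. We define a
dimensionless 2nd moment on scale $\mathbf{j}$ by

$$D_{\mathbf{j}} = \frac{2^{j_1+j_2+j_3}}{L_1L_2L_3}\langle\epsilon^2_{\mathbf{j},\mathbf{l}}\rangle = \frac{1}{L_1L_2L_3}\sum_{l_1=0}^{2^{j_1}-1}\sum_{l_2=0}^{2^{j_2}-1}\sum_{l_3=0}^{2^{j_3}-1}\epsilon^2_{\mathbf{j},\mathbf{l}}. \qquad (17)$$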
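
As a hedged illustration of the estimators in eqs.(16) and (17), the sketch below computes the $m$th moment and the dimensionless second moment from a list of SFCs. For brevity it is written for the 1-D analog (box length L in place of $L_1L_2L_3$, and $2^j$ cells in place of $2^{j_1+j_2+j_3}$); the function names are hypothetical.

```python
def sfc_moment(sfc, m):
    """m-th moment of the SFCs, estimated by averaging over the cells l."""
    return sum(e ** m for e in sfc) / len(sfc)


def second_moment_1d(sfc, L):
    """Dimensionless second moment D_j for a 1-D field in a box of length L."""
    return sum(e * e for e in sfc) / L
```

The two forms are related by `second_moment_1d(sfc, L) == len(sfc) / L * sfc_moment(sfc, 2)`, which is the 1-D version of the two expressions for the second moment given above.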


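Using eqs.(12) and (15), one can rewrite eq.(17) as

$$D_{\mathbf{j}} = \frac{1}{L_1L_2L_3}\sum_{n_1=-\infty}^{\infty}\sum_{n_2=-\infty}^{\infty}\sum_{n_3=-\infty}^{\infty}\left|\hat\phi(n_1/2^{j_1})\hat\phi(n_2/2^{j_2})\hat\phi(n_3/2^{j_3})\right|^2 P(n_1, n_2, n_3), \qquad (18)$$

where $\hat\phi(n)$ is the Fourier transform of the basic scaling function (see
eq.(A1) in Appendix A). The term $P(n_1, n_2, n_3)$ in eq.(18) is the Fourier
power spectrum defined by

$$P(n_1, n_2, n_3) = \langle\hat\delta(\mathbf{k})\hat\delta^{\dagger}(\mathbf{k})\rangle, \qquad (19)$$

where $\mathbf{k} = (k_1, k_2, k_3)$ with $k_i = 2\pi n_i/L_i$, and

$$\hat\delta(\mathbf{k}) = \frac{1}{\sqrt{L_1L_2L_3}}\int_0^{L_1}\int_0^{L_2}\int_0^{L_3}\delta(\mathbf{x})\,e^{-i\mathbf{k}\cdot\mathbf{x}}\,d\mathbf{x}. \qquad (20)$$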



One can compare the 2nd moment $D_{\mathbf{j}}$ with the DWT power spectrum
given by Fang & Feng (2000) as

$$P_{\mathbf{j}} = \frac{1}{2^{j_1+j_2+j_3}}\sum_{n_1=-\infty}^{\infty}\sum_{n_2=-\infty}^{\infty}\sum_{n_3=-\infty}^{\infty}\left|\hat\psi(n_1/2^{j_1})\hat\psi(n_2/2^{j_2})\hat\psi(n_3/2^{j_3})\right|^2 P(n_1, n_2, n_3), \qquad (21)$$

where $\hat\psi(n)$ is the Fourier transform of the basic wavelet. Therefore,
both the DWT power spectrum and the DWT one-point statistics can be calculated
with a single DWT decomposition. The former relies on the wavelet, while the
latter relies on the scaling function.

The Fourier transform of the scaling function $\hat\phi(n)$ is non-zero in
$n$-space where $|n| < 1$, and $\hat\psi(n)$ is mainly in $1/2 < |n| < 1$
(Fig. 1 and eq.(A4)). Therefore, the DWT power spectrum $P_{\mathbf{j}}$ is
actually a banded Fourier power spectrum in the wavenumber range
$\pi 2^{j_i}/L_i < |k_i| < \pi 2^{j_i+1}/L_i$, while the 2nd moment
$D_{\mathbf{j}}$ contains all powers with $|k_i| < \pi 2^{j_i+1}/L_i$. It is
well known that the 2nd moment of one-point statistics, such as
$D_{\mathbf{j}}$ or $\sigma_8$, is sensitive to the long wavelength behavior of
the perturbations. This can also be seen from the relation between the scaling
functions and wavelets as follows

$$|\hat\phi_{j,0}(k)|^2 = \sum_{n=1}^{\infty} 2^{-n}\,|\hat\psi_{j-n,0}(k)|^2. \qquad (22)$$

That is, the factor $|\hat\phi_{j,0}(k)|^2$ in eq.(22) extracts all powers on
scales larger than $L/2^j$. Eq.(22) holds only for compactly supported discrete
wavelets such as Daubechies wavelets.


3. One-point statistics of galaxy distribution in the DWT representation 

3.1. Galaxy distribution 

We now consider distributions of discrete objects, such as simulation particles
and observed or mock galaxies. If the position measurement of particles or
galaxies is perfectly precise, the number density distribution of these samples
can be written as

$$n^g(\mathbf{x}) = \sum_{m=1}^{N_g} w_m\,\delta^D(\mathbf{x} - \mathbf{x}_m) = \bar n(\mathbf{x})[1 + \delta(\mathbf{x})], \qquad (23)$$

where $N_g$ is the total number of the particles or galaxies, $\mathbf{x}_m$ is
the position of the $m$th galaxy, $w_m$ is its weight, $\bar n(\mathbf{x})$ is
the selection function, which is given by the mean number density of galaxies
when galaxy clustering is absent, and $\delta(\mathbf{x})$ is the density
fluctuation in the distribution.
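One can subject $n^g(\mathbf{x})$ to a DWT decomposition. Similar to eq.(13),
we have

$$n^g(\mathbf{x}) = \sum_{l_1=0}^{2^{j_1}-1}\sum_{l_2=0}^{2^{j_2}-1}\sum_{l_3=0}^{2^{j_3}-1}\epsilon^g_{\mathbf{j},\mathbf{l}}\,\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x}) + \text{terms of } \psi_{\mathbf{j}',\mathbf{l}'} \text{ with } j'_i \ge j_i, \qquad (24)$$

where

$$\epsilon^g_{\mathbf{j},\mathbf{l}} = \int n^g(\mathbf{x})\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} = \sum_{m=1}^{N_g} w_m\,\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x}_m). \qquad (25)$$

The DWT one-point variable of $\delta(\mathbf{x})$ is now given by

$$\epsilon_{\mathbf{j},\mathbf{l}} = \int \delta(\mathbf{x})\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} = \int\left[\frac{n^g(\mathbf{x})}{\bar n(\mathbf{x})} - 1\right]\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x}. \qquad (26)$$

Since the selection function varies slowly, we have

$$\int \frac{n^g(\mathbf{x})}{\bar n(\mathbf{x})}\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} \simeq \frac{1}{\bar n_{\mathbf{j},\mathbf{l}}}\int n^g(\mathbf{x})\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} = \frac{\epsilon^g_{\mathbf{j},\mathbf{l}}}{\bar n_{\mathbf{j},\mathbf{l}}}, \qquad (27)$$

where $\bar n_{\mathbf{j},\mathbf{l}}$ is the mean of the selection function in
the cell $(\mathbf{j}, \mathbf{l})$. The algorithm for
$\bar n_{\mathbf{j},\mathbf{l}}$ is given in the next subsection. Substituting
eq.(27) into eq.(26) we have

$$\epsilon_{\mathbf{j},\mathbf{l}} = \frac{\epsilon^g_{\mathbf{j},\mathbf{l}}}{\bar n_{\mathbf{j},\mathbf{l}}} - \sqrt{\frac{L_1L_2L_3}{2^{j_1+j_2+j_3}}}. \qquad (28)$$

The second term on the r.h.s. is due to the normalization of the scaling
function, eq.(8).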
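
The chain from galaxy positions to the one-point variable (a sum of scaling functions over galaxies, division by the mean selection function in the cell, and subtraction of the normalization term) can be sketched as follows. This is a 1-D illustration that assumes the Haar scaling function rather than D4; `galaxy_sfc` and `density_sfc` are hypothetical names.

```python
import math


def galaxy_sfc(positions, weights, L, j):
    """SFCs of a 1-D point distribution: sum over galaxies of w_m * phi_{j,l}(x_m).

    Uses the Haar scaling function (assumption), so each galaxy contributes
    only to the cell that contains it.
    """
    n_cells = 2 ** j
    amp = math.sqrt(n_cells / L)
    eg = [0.0] * n_cells
    for x, w in zip(positions, weights):
        l = min(int(x * n_cells / L), n_cells - 1)
        eg[l] += w * amp
    return eg


def density_sfc(eg, nbar, L, j):
    """One-point variable of delta: eg / nbar minus the normalization term."""
    norm = math.sqrt(L / 2 ** j)
    return [e / n - norm for e, n in zip(eg, nbar)]
```

For three unit-weight galaxies at x = 0.5, 1.5, 2.5 in a box of L = 4 with a uniform selection function of 0.75 per unit length, the j = 1 cell means of $\delta$ (the output of `density_sfc` divided by $\sqrt{L/2^j}$) are +1/3 and -1/3, as expected for 2 and 1 galaxies against a mean of 1.5 per cell.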


3.2. Selection function in the DWT representation 

By the definition in equation (23), the selection function
$\bar n(\mathbf{x})$ is the galaxy distribution if the galaxy clustering
$\delta(\mathbf{x})$ is absent. In the plane-parallel approximation, the
selection function depends only on $x_3$, i.e. the coordinate in the redshift
direction or the line-of-sight (LOS). Thus, from equation (23) the selection
function $\bar n(x_3)$ can be approximated by an average of $n^g(\mathbf{x})$
over the plane $(x_1, x_2)$, which depends upon the geometry of a real survey.
In a simple case, for example, a mock survey in a simulation box, it reads

$$\bar n(x_3) = \frac{1}{L_1L_2}\int_0^{L_1}\int_0^{L_2} n^g(\mathbf{x})\,dx_1dx_2. \qquad (29)$$

With eq.(24), eq.(29) yields

$$\bar n(x_3) = \frac{1}{\sqrt{L_1L_2}}\sum_{l_3=0}^{2^{j_3}-1}\epsilon^g_{0,0,j_3;0,0,l_3}\,\phi_{j_3,l_3}(x_3). \qquad (30)$$
By definition, $\bar n_{\mathbf{j},\mathbf{l}}$ is the mean of $\bar n(x_3)$ in the cell $(\mathbf{j}, \mathbf{l})$. Using eq.(30), we have

$$\bar n_{\mathbf{j},\mathbf{l}} = \sqrt{\frac{2^{j_3}}{L_1L_2L_3}}\;\epsilon^g_{0,0,j_3;0,0,l_3}. \qquad (31)$$



That is, the selection function can be approximately expressed by the SFC of
the (observable) galaxy distribution $n^g(\mathbf{x})$.


3.3. Effect of Poisson sampling 


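The observed galaxy distributions $n^g(\mathbf{x})$ are considered to be a
Poisson sampling with an intensity
$n(\mathbf{x}) = \bar n(\mathbf{x})[1 + \delta(\mathbf{x})]$. In this case, the
characteristic function of the galaxy distribution $n^g(\mathbf{x})$ is

$$Z[u] \equiv \left\langle\exp\left[i\int n^g(\mathbf{x})u(\mathbf{x})\,d\mathbf{x}\right]\right\rangle_P = \exp\left\{\int d\mathbf{x}\,n(\mathbf{x})\left[e^{iu(\mathbf{x})} - 1\right]\right\}, \qquad (32)$$

where $\langle...\rangle_P$ is the average for the Poisson sampling. The
$m$-point correlation functions of $n^g(\mathbf{x})$ are given by

$$\langle n^g(\mathbf{x}_1)...n^g(\mathbf{x}_m)\rangle_P = \frac{1}{i^m}\left[\frac{\delta^m Z[u]}{\delta u(\mathbf{x}_1)...\delta u(\mathbf{x}_m)}\right]_{u=0}. \qquad (33)$$

We have then

$$\langle n^g(\mathbf{x})\rangle_P = n(\mathbf{x}), \qquad (34)$$

and

$$\langle n^g(\mathbf{x})n^g(\mathbf{x}')\rangle_P = n(\mathbf{x})n(\mathbf{x}') + \delta^D(\mathbf{x} - \mathbf{x}')\,n(\mathbf{x}). \qquad (35)$$

Substituting $n(\mathbf{x}) = \bar n(\mathbf{x})[1 + \delta(\mathbf{x})]$ into
eq.(35), we have

$$\langle\delta(\mathbf{x})\delta(\mathbf{x}')\rangle = \left\langle\frac{\langle n^g(\mathbf{x})n^g(\mathbf{x}')\rangle_P}{\bar n(\mathbf{x})\bar n(\mathbf{x}')}\right\rangle - \frac{\delta^D(\mathbf{x} - \mathbf{x}')}{\bar n(\mathbf{x})} - 1. \qquad (36)$$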

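Subjecting eq.(36) to a scaling function projection, and using eq.(27), we have
approximately

$$\langle\epsilon_{\mathbf{j},\mathbf{l}}\epsilon_{\mathbf{j}',\mathbf{l}'}\rangle = \left\langle\frac{\epsilon^g_{\mathbf{j},\mathbf{l}}\,\epsilon^g_{\mathbf{j}',\mathbf{l}'}}{\bar n_{\mathbf{j},\mathbf{l}}\,\bar n_{\mathbf{j}',\mathbf{l}'}}\right\rangle - \frac{1}{\bar n_{\mathbf{j},\mathbf{l}}}\int\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\phi_{\mathbf{j}',\mathbf{l}'}(\mathbf{x})\,d\mathbf{x} - \frac{L_1L_2L_3}{2^{(j_1+j_2+j_3+j'_1+j'_2+j'_3)/2}}. \qquad (37)$$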

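The last term in eq.(37) is due to the normalization eq.(8). From eq.(37), the
2nd moment $D_{\mathbf{j}}$ is given by

$$D_{\mathbf{j}} = \frac{1}{L_1L_2L_3}\sum_{l_1=0}^{2^{j_1}-1}\sum_{l_2=0}^{2^{j_2}-1}\sum_{l_3=0}^{2^{j_3}-1}\left[\left(\frac{\epsilon^g_{\mathbf{j},\mathbf{l}}}{\bar n_{\mathbf{j},\mathbf{l}}}\right)^2 - \frac{1}{\bar n_{\mathbf{j},\mathbf{l}}}\right] - 1. \qquad (38)$$

The second term on the r.h.s. under the summation is the correction due to
Poisson noise, and the rest is the normalized 2nd moment of
$\delta(\mathbf{x})$. One can also calculate the Poisson correction for higher
order moments with eq.(33).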
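
A minimal sketch of the Poisson-corrected second moment follows, written for the 1-D analog: for each cell the squared ratio of the galaxy SFC to the mean selection function is reduced by the shot-noise term $1/\bar n$, the sum is divided by the box length, and the normalization term 1 is subtracted. The function name is hypothetical.

```python
def poisson_corrected_second_moment(eg, nbar, L):
    """1-D analog of the Poisson-corrected dimensionless second moment.

    eg:   SFCs of the galaxy distribution, one per cell
    nbar: mean selection function (number per unit length) in each cell
    L:    box length
    """
    total = sum((e / n) ** 2 - 1.0 / n for e, n in zip(eg, nbar))
    return total / L - 1.0
```

For a sample as sparse as the three-galaxy toy case (SFCs $2\sqrt{0.5}$ and $\sqrt{0.5}$, $\bar n = 0.75$, L = 4) the shot-noise term dominates and the corrected moment is negative (1/9 - 2/3), which simply signals that the clustering signal is not measurable there.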
4. Redshift distortion of one-point statistics
4.1. DWT CiC variables in redshift space

In practice, astronomical data often give only the distribution in redshift
space, i.e.

$$n^{gS}(\mathbf{s}) = \sum_{m=1}^{N_g} w_m\,\delta^D\left[\mathbf{s} - \mathbf{x}_m - \hat{\mathbf{r}}\,v_r(\mathbf{x}_m)/H\right] = \bar n^S(\mathbf{s})[1 + \delta^S(\mathbf{s})], \qquad (39)$$

where $v_r(\mathbf{x}_m)$ is the radial ($\hat{\mathbf{r}}$) component of the
velocity of the $m$th galaxy, $\bar n^S(\mathbf{s})$ is the selection function
in redshift space, and $H$ is the Hubble constant at the corresponding
redshift. Thus, we can only calculate the CiC moments in redshift space.

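For a given mass field $\delta(\mathbf{x})$, the galaxy velocity
$\mathbf{v}(\mathbf{x})$ is a random field with mean

$$\mathbf{V}(\mathbf{x}) = \langle\mathbf{v}(\mathbf{x})\rangle_v, \qquad (40)$$

where $\langle...\rangle_v$ is the average over the ensemble of velocities. The
mean velocity $\mathbf{V}(\mathbf{x})$ is also called the bulk velocity at
$\mathbf{x}$, which is assumed irrotational (Bertschinger & Dekel 1989; Dekel,
Bertschinger & Faber 1990; Dekel et al. 1999). In the linear regime, the bulk
velocity is related to the density contrast by

$$\delta(\mathbf{x}) = -\frac{1}{H\beta}\nabla\cdot\mathbf{V}(\mathbf{x}), \qquad (41)$$

where the parameter $\beta \simeq \Omega^{0.6}/b$ at present, i.e. redshift
$z = 0$, and $b$ is the linear bias parameter. The rms deviation of the
velocity $\mathbf{v}(\mathbf{x})$ from the bulk velocity
$\mathbf{V}(\mathbf{x})$ is

$$\langle[v_i(\mathbf{x}) - V_i(\mathbf{x})]^2\rangle_v = [\sigma^v_i(\mathbf{x})]^2, \quad i = 1, 2, 3. \qquad (42)$$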


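In order to express the scale dependence of the bulk velocity
$V_i(\mathbf{x})$ and the variance $\sigma^v_i(\mathbf{x})$, we can also
decompose the velocity field $\mathbf{v}(\mathbf{x})$ with a DWT, i.e.

$$v_{i;\mathbf{j},\mathbf{l}} = \int v_i(\mathbf{x})\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x}. \qquad (43)$$

Obviously, the variance $\sigma^v_i(\mathbf{x})$ on scale $\mathbf{j}$ is given
by

$$\left[\sigma^{v_i}_{\mathbf{j}}\right]^2 = \frac{2^{j_1+j_2+j_3}}{L_1L_2L_3}\left[\langle v^2_{i;\mathbf{j},\mathbf{l}}\rangle_v - \langle v_{i;\mathbf{j},\mathbf{l}}\rangle^2_v\right]. \qquad (44)$$

Although $v_{i;\mathbf{j},\mathbf{l}}$ is $\mathbf{l}$-dependent,
$\sigma^{v_i}_{\mathbf{j}}$ should be $\mathbf{l}$-independent if the random
field $\mathbf{v}(\mathbf{x})$ is ergodic. On large scales, say
$> 10\,h^{-1}$ Mpc, the velocity field $\mathbf{v}(\mathbf{x})$ is roughly
Gaussian, and it can be described by its mean and variance, which are generally
scale-dependent.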



4.2. Redshift distortion without selection function effect 


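Because of eq.(39), the directly measurable DWT variable is not given by
eqs.(26) or (28), but by

$$\epsilon^S_{\mathbf{j},\mathbf{l}} = \int \delta^S(\mathbf{s})\phi_{\mathbf{j},\mathbf{l}}(\mathbf{s})\,d\mathbf{s} = \int\left[\frac{n^{gS}(\mathbf{s})}{\bar n^S(\mathbf{s})} - 1\right]\phi_{\mathbf{j},\mathbf{l}}(\mathbf{s})\,d\mathbf{s} \simeq \frac{1}{\bar n^S_{\mathbf{j},\mathbf{l}}}\sum_{m=1}^{N_g} w_m\,\phi_{\mathbf{j},\mathbf{l}}\left[\mathbf{x}_m + \hat{\mathbf{r}}\,v_r(\mathbf{x}_m)/H\right] - \sqrt{\frac{L_1L_2L_3}{2^{j_1+j_2+j_3}}}, \qquad (45)$$

where $\bar n^S_{\mathbf{j},\mathbf{l}}$ is calculated in the same way as
eq.(31), but with $n^{gS}(\mathbf{s})$ replacing $n^g(\mathbf{x})$.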

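In this section, we do not consider the effect of the selection function, i.e.
$\bar n$ = const. In the plane-parallel approximation, i.e. $\hat{\mathbf{r}}$
is along the $x_3$-direction, eq.(45) becomes

$$\epsilon^S_{\mathbf{j},\mathbf{l}} = \frac{1}{\bar n}\int n^g(\mathbf{x})\,\exp\left[\frac{v_3(\mathbf{x})}{H}\frac{\partial}{\partial x_3}\right]\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} - \sqrt{\frac{L_1L_2L_3}{2^{j_1+j_2+j_3}}}. \qquad (46)$$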


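If the velocity field is Gaussian, subjecting eq.(46) to an average over the
ensemble of velocities, we have

$$\langle\epsilon^S_{\mathbf{j},\mathbf{l}}\rangle_v = \int [1 + \delta(\mathbf{x})]\,\exp\left[\frac{V_3(\mathbf{x})}{H}\frac{\partial}{\partial x_3} + \frac{1}{2}\left(\frac{\sigma^{v_3}_{\mathbf{j}}}{H}\right)^2\frac{\partial^2}{\partial x_3^2}\right]\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} - \sqrt{\frac{L_1L_2L_3}{2^{j_1+j_2+j_3}}}. \qquad (47)$$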

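For clarity, the angle brackets $\langle...\rangle_v$ are dropped hereafter
without causing any confusion. If we consider only the linear effect of the
bulk velocity, equation (47) is approximately

$$\epsilon^S_{\mathbf{j},\mathbf{l}} = \int [1 + \delta(\mathbf{x})]\left[1 + \frac{V_3(\mathbf{x})}{H}\frac{\partial}{\partial x_3}\right]\exp\left[\frac{1}{2}\left(\frac{\sigma^{v_3}_{\mathbf{j}}}{H}\right)^2\frac{\partial^2}{\partial x_3^2}\right]\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} - \sqrt{\frac{L_1L_2L_3}{2^{j_1+j_2+j_3}}}. \qquad (48)$$

Neglecting the terms of the order of $V_3(\mathbf{x})\delta(\mathbf{x})$, and
using the linear relation between $\delta(\mathbf{x})$ and
$\mathbf{V}(\mathbf{x})$, we have

$$\epsilon^S_{\mathbf{j},\mathbf{l}} = \int\left[1 + \delta(\mathbf{x}) - \beta\left(\frac{\partial}{\partial x_3}\nabla^{-2}\delta(\mathbf{x})\right)\frac{\partial}{\partial x_3}\right]\exp\left[\frac{1}{2}\left(\frac{\sigma^{v_3}_{\mathbf{j}}}{H}\right)^2\frac{\partial^2}{\partial x_3^2}\right]\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} - \sqrt{\frac{L_1L_2L_3}{2^{j_1+j_2+j_3}}}$$

$$= \sum_{\mathbf{l}'}\epsilon_{\mathbf{j},\mathbf{l}'}\int\phi_{\mathbf{j},\mathbf{l}'}(\mathbf{x})\,e^{\frac{1}{2}(\sigma^{v_3}_{\mathbf{j}}/H)^2\partial^2/\partial x_3^2}\,\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} + \beta\sum_{\mathbf{l}'}\epsilon_{\mathbf{j},\mathbf{l}'}\int\phi_{\mathbf{j},\mathbf{l}'}(\mathbf{x})\,e^{\frac{1}{2}(\sigma^{v_3}_{\mathbf{j}}/H)^2\partial^2/\partial x_3^2}\,\frac{\partial^2}{\partial x_3^2}\nabla^{-2}\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} + \int e^{\frac{1}{2}(\sigma^{v_3}_{\mathbf{j}}/H)^2\partial^2/\partial x_3^2}\,\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,d\mathbf{x} - \sqrt{\frac{L_1L_2L_3}{2^{j_1+j_2+j_3}}}. \qquad (49)$$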
In the last step, integration by parts for $\partial/\partial x_3$ and the
completeness relation equation (10) are used. The summation over $\mathbf{l}'$
in eq.(49) is over $l'_i = 0 ... 2^{j_i} - 1$.

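To simplify eq.(49), we define the following matrices

$$\gamma^{a,b}_{\mathbf{j},\mathbf{l},\mathbf{l}'} = \int\phi_{\mathbf{j},\mathbf{l}}(\mathbf{x})\,A^a B^b\,\phi_{\mathbf{j},\mathbf{l}'}(\mathbf{x})\,d\mathbf{x}, \qquad (50)$$

where the differential operators $A$ and $B$ are

$$A = \exp\left[\frac{1}{2}\left(\frac{\sigma^{v_3}_{\mathbf{j}}}{H}\right)^2\frac{\partial^2}{\partial x_3^2}\right], \qquad B = \frac{\partial^2}{\partial x_3^2}\nabla^{-2}. \qquad (51)$$

Obviously $\gamma^{a,b}_{\mathbf{j},\mathbf{l},\mathbf{l}'}$ depends on
$\Delta\mathbf{l} = (|l_1 - l'_1|, |l_2 - l'_2|, |l_3 - l'_3|)$, not on
$\mathbf{l}$ or $\mathbf{l}'$ individually. Appendix A provides the algorithms
to calculate $\gamma^{a,b}_{\mathbf{j},\mathbf{l},\mathbf{l}'}$.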

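Thus, eq.(49) becomes

$$\epsilon^S_{\mathbf{j},\mathbf{l}} = \sum_{\mathbf{l}'}\left(\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}'} + \beta\gamma^{1,1}_{\mathbf{j},\mathbf{l},\mathbf{l}'}\right)\epsilon_{\mathbf{j},\mathbf{l}'} + \sqrt{\frac{L_1L_2L_3}{2^{j_1+j_2+j_3}}}\left[\sum_{\mathbf{l}'}\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}'} - 1\right]. \qquad (52)$$

Using the so-called "partition of unity" (Fang & Feng 2000), one can show that

$$\sum_{\mathbf{l}'}\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}'} = 1. \qquad (53)$$

Therefore, eq.(52) gives

$$\epsilon^S_{\mathbf{j},\mathbf{l}} = \sum_{\mathbf{l}'}\left(\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}'} + \beta\gamma^{1,1}_{\mathbf{j},\mathbf{l},\mathbf{l}'}\right)\epsilon_{\mathbf{j},\mathbf{l}'}. \qquad (54)$$

This is the mapping of one-point variables between real and redshift spaces.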


4.3. Redshift distortion of the 2nd moment 


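The second moment of one-point statistics in redshift space is

$$\langle|\epsilon^S_{\mathbf{j},\mathbf{l}}|^2\rangle = \sum_{\mathbf{l}',\mathbf{l}''}\left(\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}'} + \beta\gamma^{1,1}_{\mathbf{j},\mathbf{l},\mathbf{l}'}\right)\left(\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}''} + \beta\gamma^{1,1}_{\mathbf{j},\mathbf{l},\mathbf{l}''}\right)\langle\epsilon_{\mathbf{j},\mathbf{l}'}\epsilon_{\mathbf{j},\mathbf{l}''}\rangle. \qquad (55)$$

One can show that, even for a weakly non-linear field, both
$\langle\epsilon_{\mathbf{j},\mathbf{l}}\epsilon_{\mathbf{j},\mathbf{l}'}\rangle$
and $\gamma^{a,b}_{\mathbf{j},\mathbf{l},\mathbf{l}'}$ are symmetric and
quasi-diagonalized with respect to $\mathbf{l}$ and $\mathbf{l}'$, and
$\langle|\epsilon_{\mathbf{j},\mathbf{l}}|^2\rangle$ should be
$\mathbf{l}$-independent. The wavelet counterpart, i.e.
$\langle\tilde\epsilon_{\mathbf{j},\mathbf{l}}\tilde\epsilon_{\mathbf{j},\mathbf{l}'}\rangle \simeq \delta_{\mathbf{l},\mathbf{l}'}\langle\tilde\epsilon^2_{\mathbf{j},\mathbf{l}}\rangle$,
has been shown by Pando, Feng & Fang (2001). Thus, eq.(55) becomes

$$\langle|\epsilon^S_{\mathbf{j},\mathbf{l}}|^2\rangle \simeq \sum_{\mathbf{l}'}\left(\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}'} + \beta\gamma^{1,1}_{\mathbf{j},\mathbf{l},\mathbf{l}'}\right)^2\langle|\epsilon_{\mathbf{j},\mathbf{l}'}|^2\rangle. \qquad (56)$$

Using the completeness relation equation (10), we have

$$\sum_{\mathbf{l}'}\gamma^{a_1,b_1}_{\mathbf{j},\mathbf{l},\mathbf{l}'}\,\gamma^{a_2,b_2}_{\mathbf{j},\mathbf{l}',\mathbf{l}''} = \gamma^{a_1+a_2,b_1+b_2}_{\mathbf{j},\mathbf{l},\mathbf{l}''}. \qquad (57)$$

Thus, eq.(56) gives

$$\langle|\epsilon^S_{\mathbf{j},\mathbf{l}}|^2\rangle \simeq \left(\gamma^{2,0}_{\mathbf{j},0,0} + 2\beta\gamma^{2,1}_{\mathbf{j},0,0} + \beta^2\gamma^{2,2}_{\mathbf{j},0,0}\right)\langle|\epsilon_{\mathbf{j},\mathbf{l}}|^2\rangle, \qquad (58)$$

where we have used
$\gamma^{a,b}_{\mathbf{j},\mathbf{l},\mathbf{l}} = \gamma^{a,b}_{\mathbf{j},0,0}$.
Therefore, we have finally the mapping of $D_{\mathbf{j}}$ between real and
redshift spaces

$$D^S_{\mathbf{j}} = \left[\gamma^{2,0}_{\mathbf{j},0,0} + 2\beta\gamma^{2,1}_{\mathbf{j},0,0} + \beta^2\gamma^{2,2}_{\mathbf{j},0,0}\right]D_{\mathbf{j}}. \qquad (59)$$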

4.4. Effect of selection functions 

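In the linear approximation, it is reasonable to estimate the effect of
redshift distortion of $n^{gS}(\mathbf{s})$ and $\bar n^S(\mathbf{s})$
separately. In this case, one can still use eq.(54) as the mapping from
$\epsilon^S_{\mathbf{j},\mathbf{l}}$ to $\epsilon_{\mathbf{j},\mathbf{l}}$. We
only need to study the effect of the mapping between $\bar n^S(\mathbf{s})$ and
$\bar n(\mathbf{x})$, which is given by

$$\langle\bar n^S(\mathbf{s})\rangle_v = \langle\bar n[\mathbf{x} + \hat{\mathbf{r}}\,v_r(\mathbf{x})/H]\rangle_v \simeq \bar n(\mathbf{x}) + \frac{1}{H}V_r(\mathbf{x})\,\hat{\mathbf{r}}\cdot\nabla\bar n(\mathbf{x}), \qquad (60)$$

where we have used $\langle v_r\rangle_v = V_r$. With the plane-parallel
approximation of the selection function (§3.2), eq.(60) becomes

$$\bar n^S(\mathbf{s}) = \bar n(x_3) + \frac{1}{H}\frac{d\bar n(x_3)}{dx_3}V_3(\mathbf{x}), \qquad (61)$$

where we have dropped $\langle...\rangle_v$ for $\bar n^S(\mathbf{s})$. From
equation (41), $V_3$ can be represented by $\delta(\mathbf{x})$, so we have

$$\bar n^S(\mathbf{s}) = \bar n(x_3)\left[1 - \beta\,\frac{d\ln\bar n(x_3)}{dx_3}\frac{\partial}{\partial x_3}\nabla^{-2}\delta(\mathbf{x})\right]. \qquad (62)$$
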
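Combining eqs.(54) and (62), we have finally

$$\frac{\epsilon^S_{\mathbf{j},\mathbf{l}}}{\bar n^S_{\mathbf{j},\mathbf{l}}} = \sum_{\mathbf{l}'}\left(\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}'} + \beta\gamma^{1,1}_{\mathbf{j},\mathbf{l},\mathbf{l}'}\right)\frac{\epsilon_{\mathbf{j},\mathbf{l}'}}{\bar n_{\mathbf{j},\mathbf{l}'}} + \beta\left.\frac{\partial\ln\bar n(x_3)}{\partial x_3}\right|_{\mathbf{j},\mathbf{l}}\sum_{l'_3}Q_{\mathbf{j},\mathbf{l},\mathbf{l}'}\frac{\epsilon_{\mathbf{j},\mathbf{l}'}}{\bar n_{\mathbf{j},\mathbf{l}'}}, \qquad (63)$$

where $\partial\ln\bar n(x_3)/\partial x_3|_{\mathbf{j},\mathbf{l}}$ stands for
the mean value of $\partial\ln\bar n(x_3)/\partial x_3$ in the cell
$(\mathbf{j}, \mathbf{l})$, and $\bar n_{\mathbf{j},\mathbf{l}}$ is given by
eq.(31). In the last term of eq.(63), $l_1 = l'_1$ and $l_2 = l'_2$, and the
summation runs only over $l'_3$. The coefficient
$Q_{\mathbf{j},\mathbf{l},\mathbf{l}'}$ is defined by

$$Q_{\mathbf{j},\mathbf{l},\mathbf{l}'} = \int\phi_{\mathbf{j},l_1,l_2,l_3}(\mathbf{x})\,\frac{\partial}{\partial x_3}\nabla^{-2}\,\phi_{\mathbf{j},l_1,l_2,l'_3}(\mathbf{x})\,d\mathbf{x}. \qquad (64)$$

The calculation of $Q_{\mathbf{j},\mathbf{l},\mathbf{l}'}$ is given in Appendix
A.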

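Because all $\mathbf{l}$-diagonal elements of
$Q_{\mathbf{j},\mathbf{l},\mathbf{l}'}$ are zero (Appendix A), and
$\gamma^{1,0}_{\mathbf{j},\mathbf{l},\mathbf{l}'}$ and
$\gamma^{1,1}_{\mathbf{j},\mathbf{l},\mathbf{l}'}$ are
quasi-$\mathbf{l}$-diagonal, the first and the second terms on the r.h.s. of
equation (63) are not correlated. We have then

$$\left\langle\left|\frac{\epsilon^S_{\mathbf{j},\mathbf{l}}}{\bar n^S_{\mathbf{j},\mathbf{l}}}\right|^2\right\rangle \simeq \left(\gamma^{2,0}_{\mathbf{j},0,0} + 2\beta\gamma^{2,1}_{\mathbf{j},0,0} + \beta^2\gamma^{2,2}_{\mathbf{j},0,0}\right)\left\langle\left|\frac{\epsilon_{\mathbf{j},\mathbf{l}}}{\bar n_{\mathbf{j},\mathbf{l}}}\right|^2\right\rangle + \beta^2\left[\left.\frac{\partial\ln\bar n(x_3)}{\partial x_3}\right|_{\mathbf{j},\mathbf{l}}\right]^2\sum_{l'_3}Q^2_{\mathbf{j},\mathbf{l},\mathbf{l}'}\left\langle\left|\frac{\epsilon_{\mathbf{j},\mathbf{l}'}}{\bar n_{\mathbf{j},\mathbf{l}'}}\right|^2\right\rangle. \qquad (65)$$

For a uniform field,
$\langle|\epsilon^S_{\mathbf{j},\mathbf{l}}/\bar n^S_{\mathbf{j},\mathbf{l}}|^2\rangle$
and
$\langle|\epsilon_{\mathbf{j},\mathbf{l}}/\bar n_{\mathbf{j},\mathbf{l}}|^2\rangle$
are $\mathbf{l}$-independent. Thus, equation (65) gives

$$\left\langle\left|\frac{\epsilon^S_{\mathbf{j},\mathbf{l}}}{\bar n^S_{\mathbf{j},\mathbf{l}}}\right|^2\right\rangle = \left\{\gamma^{2,0}_{\mathbf{j},0,0} + 2\beta\gamma^{2,1}_{\mathbf{j},0,0} + \beta^2\gamma^{2,2}_{\mathbf{j},0,0} + \beta^2\left[\left.\frac{\partial\ln\bar n(x_3)}{\partial x_3}\right|_{\mathbf{j},\mathbf{l}}\right]^2\sum_{l'_3}Q^2_{\mathbf{j},\mathbf{l},\mathbf{l}'}\right\}\left\langle\left|\frac{\epsilon_{\mathbf{j},\mathbf{l}}}{\bar n_{\mathbf{j},\mathbf{l}}}\right|^2\right\rangle. \qquad (66)$$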


Using the inequality equation (A16), we can show that if 

d\nn{x^) 2-^3 

dx3 ^ (27r)T2L3’ 

we have 

- 2 

E < 77- 

'J4j q-z' 

That is, the selection function term in equation (66) is even less than the
second order terms if the selection function is slowly varying with $x_3$.



P 


d\Yin{x^) 

dx?, 



5. Testing the DWT mapping between real and redshift spaces 

5.1. Simulation samples 

We use N-body simulation samples to test the DWT algorithms for recovering the
one-point statistics from redshift space to real space. The model parameters
are listed in Table 1. Notice that the simulation boxes are relatively large to
minimize the effect on the bulk velocities due to the missing power at long
wavelengths (Tormen & Bertschinger 1996).

We use a modified P$^3$M code (Jing & Fang 1994) to evolve $128^3$ ($256^3$ for
LCDM2) cold dark matter (CDM) particles in a periodic cube of length $L$ on
each side. The linear power spectrum is given by the fitting formula in Bardeen
et al. (1986). The Zel'dovich approximation is applied to set up the initial
perturbation. The particles evolve for 600 (800 for LCDM2) integration steps
from $z_i = 12$ down to $z = 0$ for all the models.


In addition to $\beta$ at $z \simeq 0$, we can also test the redshift
distortion algorithms at high redshifts. In this case, $\beta$ is a function of
$\Omega$ and $\Lambda$ as (Lahav et al. 1991)

$$\beta(z, \Omega, \Lambda) \simeq \left[\frac{\Omega(1+z)^3}{\Omega(1+z)^3 + (1 - \Omega - \Lambda)(1+z)^2 + \Lambda}\right]^{0.6}. \qquad (69)$$
The simulation code is modified to generate light-cone outputs from $z = 3$ to
0 by a method similar to that of the Hubble Volume Simulations (Evrard et al.
2001). Instead of producing spherical light-cones, we apply the plane-parallel
approximation in which a plane (light-front) sweeps through the simulation box
at the speed of light. Let the LOS be along the $x_3$-axis, and the position of
the light plane at time $t$ be $x_p(t)$. The position and velocity of a
particle are recorded when it crosses the plane, i.e. when the position
$x_3(t)$ of the particle satisfies

$$x_3(t) = x_p(t). \qquad (70)$$


Since the time step $\Delta t$ in the simulations is finite, it is
computationally impractical to use eq.(70) directly. The position of the plane,
$x_p$, in the time interval from step $i$ to step $i + 1$ is approximately

$$x_p(t_i + \alpha\Delta t) \simeq x_p(t_i) + \alpha[x_p(t_{i+1}) - x_p(t_i)], \qquad (71)$$

where $\alpha$ is from 0 to 1. On the other hand, for a particle we have
$x_3(t_i + \alpha\Delta t) \simeq x_3(t_i) + \alpha\Delta t\,v_3(t_i)$, where
$v_3(t_i)$ is the velocity of the particle along the LOS. Thus, if we can find
a solution of $|\alpha| < 1$ from the following equation

$$x_3(t_i) + \alpha\Delta t\,v_3(t_i) = x_p(t_i) + \alpha[x_p(t_{i+1}) - x_p(t_i)], \qquad (72)$$

we have $x_3(t_i + \alpha\Delta t) \simeq x_p(t_i + \alpha\Delta t)$, i.e. the
particle crosses the plane at time $t_i + \alpha\Delta t$. We then record its
position $\mathbf{x}(t_i) + \alpha\Delta t\,\mathbf{v}(t_i)$ and velocity
$\mathbf{v}(t_i)$. The accuracy is satisfactory. For example, the difference
between $x_3(t_i + \alpha\Delta t)$ and $x_p(t_i + \alpha\Delta t)$ is
typically less than 50 $h^{-1}$ kpc out to redshift $z = 2.5$ in the LCDM1
model. In the actual simulations, the integration variable is $a$, the cosmic
expansion factor, instead of $t$.

Table 1: Models of N-body simulations.

Model   L/h^-1 Mpc   $\Omega$   $\Lambda$   $\Gamma$   $\sigma_8$   run   particle
LCDM1   800          0.3        0.7         0.225      0.95         6     128^3
LCDM2   800          0.3        0.7         0.21       0.81         1     256^3
OCDM    800          0.3        0.0         0.225      0.95         6     128^3
SCDM    800          1.0        0.0         0.50       0.62         6     128^3
TCDM    800          1.0        0.0         0.25       0.60         6     128^3
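
The crossing test between two time steps amounts to solving a linear equation for $\alpha$: the particle moves as $x_3 + \alpha\Delta t\,v_3$ and the plane as $x_p + \alpha\,(x_p^{next} - x_p)$. The sketch below solves it; the function name and the choice to accept only $0 \le \alpha < 1$ are assumptions made for illustration, not details taken from the simulation code.

```python
def crossing_alpha(x3, v3, dt, xp_now, xp_next):
    """Fraction of the step at which a particle meets the light plane.

    Solves x3 + alpha*dt*v3 = xp_now + alpha*(xp_next - xp_now) for alpha.
    Returns None if there is no crossing within the step (assumed to mean
    0 <= alpha < 1) or if particle and plane move in parallel.
    """
    denom = dt * v3 - (xp_next - xp_now)
    if denom == 0.0:
        return None
    alpha = (xp_now - x3) / denom
    return alpha if 0.0 <= alpha < 1.0 else None
```

For a particle at rest at $x_3 = 10$ and a plane moving from 9 to 11 during the step, the function returns 0.5, and the recorded position is then $x_3 + 0.5\,\Delta t\,v_3 = 10$.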

Since the size of the simulation box is less than the distance swept by the plane from
redshift $z = 3$ to 0, the motion of the plane is realized by periodic extensions of the simulation
box when the plane meets the boundary. This treatment should not have significant effects
on our analyses, because the simulations already impose a periodic boundary condition,
and the largest scale analyzed for each simulation is only a quarter of its box size. Once
the real-space light-cone output is obtained, the observed redshift of a particle is given by
$z_{\rm obs} = z + (1 + z)v_r/c$, where $z$ is the actual redshift, and $v_r$ is the peculiar velocity. To avoid
negative $z_{\rm obs}$ caused by peculiar velocities at low redshifts, a lower cut is set to $z = 0.005$.
This results in fewer than a hundred particles bearing a negative $z_{\rm obs}$ in each run, which are
then removed from the samples. Three light-cone outputs along the three orthogonal axes
are produced for each simulation, which effectively increases the number of realizations by a
factor of 3.
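
The mapping to observed redshift and the two cuts described above can be sketched as follows. The function names and the `(z, v_r)` pair layout are hypothetical, and the sketch assumes that the lower cut applies to the actual redshift $z$ before the negative-$z_{\rm obs}$ particles are removed.

```python
C_KM_S = 299792.458  # speed of light in km/s


def observed_redshift(z, v_r):
    """z_obs = z + (1 + z) * v_r / c, with v_r in km/s."""
    return z + (1.0 + z) * v_r / C_KM_S


def light_cone_redshifts(particles, z_cut=0.005):
    """Return z_obs for (z, v_r) pairs that pass the cut and keep z_obs >= 0."""
    kept = []
    for z, v_r in particles:
        if z < z_cut:
            continue  # below the lower redshift cut (assumed to act on z)
        z_obs = observed_redshift(z, v_r)
        if z_obs < 0.0:
            continue  # negative observed redshift: removed from the sample
        kept.append(z_obs)
    return kept
```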


5.2. Calculation of $\gamma^{0,1}_{{\bf j};0,0}$, $\gamma^{0,2}_{{\bf j};0,0}$, and $\gamma^{2,0}_{{\bf j};0,0}$

Before testing the theory of recovering $D_{\bf j}$ (eq.(59)), we need to study the factors $\gamma^{0,1}_{{\bf j};0,0}$,
$\gamma^{0,2}_{{\bf j};0,0}$, and $\gamma^{2,0}_{{\bf j};0,0}$, because the first two are the key to the redshift distortion of bulk velocity,
while the last one governs the distortion of velocity dispersion. Below we drop (0, 0) in subscripts
for simplicity.

By the definition of eq.(50), the factors $\gamma^{0,1}_{\bf j}$ and $\gamma^{0,2}_{\bf j}$ are given by



We plot $\gamma^{0,1}_{\bf j}$ and $\gamma^{0,2}_{\bf j}$ in Fig. 2 for the D4 wavelet. It should be pointed out that $\gamma^{0,b}_{\bf j}$ depends
only on the shape, not the scale ${\bf j}$, of the scaling function. For instance, $\gamma^{0,b}_{2,2,3} = \gamma^{0,b}_{3,3,4} = \cdots =
\gamma^{0,b}_{5,5,6}$, because the 3-D cells of ${\bf j} = (2, 2, 3), (3, 3, 4), \ldots, (5, 5, 6)$ have the same shape, i.e. all
the ratios of length : width : height of these cells are 2 : 2 : 1, even though they have different
volumes (scales). For cubic cells, i.e. $j_1 = j_2 = j_3 = j$, we have $\gamma^{0,1}_{\bf j} \simeq 1/3$ and $\gamma^{0,2}_{\bf j} \simeq 1/5$.
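
The distinction between shape and scale can be made concrete with a small sketch. It assumes the 1-D scale $R(j) = 2^{-j}L$ with $L = 800\,h^{-1}$ Mpc used for the simulations; the helper names are illustrative only.

```python
def cell_sides(j, box=800.0):
    """Side lengths (h^-1 Mpc) of the DWT cell of mode j = (j1, j2, j3)."""
    return tuple(box / 2**ji for ji in j)


def cell_shape(j):
    """Side ratios length : width : height, normalised to the shortest side."""
    j_max = max(j)
    return tuple(2 ** (j_max - ji) for ji in j)


# Modes with the same shape but different volumes
shapes = [cell_shape(j) for j in [(2, 2, 3), (3, 3, 4), (5, 5, 6)]]
```

Here `cell_sides((2, 2, 3))` gives a 200 x 200 x 100 cell and `cell_sides((3, 3, 4))` a 100 x 100 x 50 cell, while `cell_shape` returns `(2, 2, 1)` for both.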

The factor $\gamma^{2,0}_{\bf j}$ is given by



It depends on both the scaling function and the velocity dispersion $\sigma^v_{\bf j}$, and therefore it is
model-dependent.

The velocity variance $\sigma^v_{\bf j}$ is scale-dependent. It is shown in Fig. 3 for all the models,
which is similar to Fig. 6 in Yang et al. (2002). Actually, $\sigma^v_{\bf j}$ is given by all the power of
pairwise velocity on scales less than the scale of ${\bf j}$, and therefore, it is larger on larger scales.

To avoid crowding the figure, we only show $\gamma^{2,0}_{j,j,j_3}$ for LCDM1 in Fig. 4. Other modes,
such as $\gamma^{2,0}_{j_1,j_2,j_3}$ with $j_1 \neq j_2$, follow $\gamma^{2,0}_{j,j,j_3}$ closely. An interesting feature seen in Fig. 4 is that
$\gamma^{2,0}_{j,j,2}$ and $\gamma^{2,0}_{j,j,3}$ are almost equal to 1 from $j = 2$ to 6, which corresponds to spatial scales from 200
to 12.5 $h^{-1}$ Mpc on the celestial sphere, and 200 to 100 $h^{-1}$ Mpc along the LOS. Even for
$\gamma^{2,0}_{j,j,4}$, the dependence on $j$ is only mild. Moreover, this property is model-independent, i.e.
for other models, we also have the $j$-independence of $\gamma^{2,0}_{j,j,2}$ and $\gamma^{2,0}_{j,j,3}$.

The factor $\gamma^{2,0}_{j,j,j_3}$ is sensitive to $\sigma^v_{\bf j}$ (as a variable) when the scale along the LOS is
small, but insensitive when the scale is large. For example, when $\sigma^v_{\bf j}$ varies from 0 to 500
km s$^{-1}$, $\gamma^{2,0}_{j,j,3}$ drops from 1 to 0.97, i.e. the change is only 3%, while $\gamma^{2,0}_{j,j,6}$ drops from 1 to 0.46, i.e.
the change is by a factor of 2. If we try to use a scale-independent $\sigma^v$ to recover the real
space power spectrum, we should choose $\sigma^v$ to give a good fit on small scales, because the
value of $\sigma^v$ does not significantly affect the large scales.

In addition to $j_3$, $\gamma^{2,0}_{\bf j}$ depends on $j_1$ and $j_2$ (collectively $j_\perp$) through $\sigma^v_{\bf j}$, which is less
than 500 km s$^{-1}$ (Fig. 3). Therefore, when the LOS scale is above 100 $h^{-1}$ Mpc, $\gamma^{2,0}_{\bf j}$ is
almost a constant with respect to $j_\perp$. On the other hand, when the scale is below 50 $h^{-1}$
Mpc, $\gamma^{2,0}_{\bf j}$ shows a mild dependence on $j_\perp$ due to the scale dependence of $\sigma^v_{\bf j}$.

In short, the DWT representation lets us see the different behavior of the
two types of redshift distortion. The distortion due to bulk velocity is sensitive to the shape of
the mode, not the scale, while the distortion due to velocity dispersion is just the contrary.


5.3. Recovery of $D_{\bf j}$

We neglect the effect of selection function here for the simulation samples, which is
similar to that on the DWT power spectrum tested in Yang et al. (2002). Since the matrix
$\gamma^{a,b}_{{\bf j};{\bf l},{\bf l}'}$ is quasi-diagonalized with respect to $({\bf l}, {\bf l}')$, we have approximately $\gamma^{2,b}_{\bf j} \simeq \gamma^{2,0}_{\bf j}\gamma^{0,b}_{\bf j}$. Thus,
eq.(59) yields

$$D^S_{\bf j} \simeq \gamma^{2,0}_{\bf j}(1 + 2\beta\gamma^{0,1}_{\bf j} + \beta^2\gamma^{0,2}_{\bf j})D_{\bf j}. \tag{75}$$

In eq.(75), the contributions of the operators A and B to the redshift distortion are separated.
Usually the terms of B ($\gamma^{0,1}_{\bf j}$ and $\gamma^{0,2}_{\bf j}$) are called linear redshift distortion, and that of A ($\gamma^{2,0}_{\bf j}$)
non-linear distortion. This terminology may cause confusion. Here "linear" means only
that B is from the linear term of $v_3$ in the approximation eq.(48), and "non-linear" means
that A is non-linear in the velocity dispersion $\sigma^v_{\bf j}$. It does not imply that the terms of B are
enough for estimating the redshift distortion of a field that is in the linear regime. Since the
velocity dispersion $\sigma^v_{\bf j}$ is non-zero in the linear regime, the non-linear distortion term of A
may also play a non-negligible role even when the field is linear. In this paper, we follow the
tradition to call the terms of B and A the linear and non-linear effects of redshift distortion,
respectively. However, it should be kept in mind that these names do not reflect whether
the field is linear or non-linear.

For $j$-diagonal modes, equation (75) reduces to $D^S_{\bf j} \simeq \gamma^{2,0}_{\bf j}(1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2)D_{\bf j}$. The
factors within the parentheses are known as the linear redshift-to-real space mapping for the
two-point correlation function (Kaiser 1987).
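
The cubic-cell values 1/3 and 1/5 coincide with the isotropic angle averages of $\mu^2$ and $\mu^4$, where $\mu$ is the cosine of the angle between a wavevector and the LOS. A short numerical check, assuming NumPy is available, is sketched below; the function names are illustrative, and `recover_real_space` simply inverts the $j$-diagonal form of eq.(75) for a user-supplied $\gamma^{2,0}_{\bf j}$.

```python
import numpy as np


def mu_moment(b, n_nodes=64):
    """Isotropic average of mu**(2b) over -1 <= mu <= 1 by Gauss-Legendre quadrature."""
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    return float(np.sum(weights * nodes ** (2 * b)) / 2.0)


def kaiser_factor(beta):
    """Linear factor 1 + (2/3) beta + (1/5) beta**2."""
    return 1.0 + 2.0 * beta * mu_moment(1) + beta**2 * mu_moment(2)


def recover_real_space(d_s, beta, gamma20=1.0):
    """Invert the j-diagonal form of eq. (75): D = D^S / [gamma20 * kaiser_factor]."""
    return d_s / (gamma20 * kaiser_factor(beta))
```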

Figure 5 shows the recovery of $j$-diagonal $D_{\bf j}$. It is evident that the recovery equation
(75) works very well on all scales and redshifts considered, except for the SCDM model at
$z = 0.71$. Due to greater $\beta$ parameters, the distortions of the TCDM and the SCDM models
are generally stronger than that of the rest. For these two models, $D^S_{\bf j}$ is almost parallel to
$D_{\bf j}$, which indicates a very weak non-linear redshift distortion on all scales. This is consistent
with the fact that the velocity dispersions of the two models are small on all scales compared
to the others. Since the volume of a cubic cell with 12.5 $h^{-1}$ Mpc on each side is approximately
the same as that of a sphere with a radius of 8 $h^{-1}$ Mpc, the values of $D_{6,6,6}$ are consistent
with $\sigma_8^2$ at corresponding redshifts for each model. The remaining differences are due to
the difference in the window functions.

Similar to $D_{\bf j}$, one can generalize $\sigma_8$ to $\sigma_R$, which is the rms density fluctuation in a
sphere of radius $R$. Since the behavior of $\sigma_R^2$ is the same as $D_{j,j,j}$, one can recover the real
space $\sigma_R$ from the redshift space $\sigma^S_R$ in a similar way as eq.(75), i.e.
$\sigma^S_R \simeq [\gamma^{2,0}_{\bf j}(1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2)]^{1/2}\sigma_R$, where $j$ is chosen to match the volumes of the DWT cell and the spherical
top-hat window of radius $R$. Figure 6 demonstrates the recovery of $\sigma_R$ for all the models. It
is a coincidence that $\sigma^S_8 \simeq \sigma_8$ for the low density models.
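
The volume matching between a cubic DWT cell and a spherical top-hat window is simple arithmetic, sketched here under the assumption of cubic cells of side $2^{-j}L$ with $L = 800\,h^{-1}$ Mpc. The helper names are illustrative.

```python
import math


def equal_volume_radius(side):
    """Radius of the sphere whose volume equals that of a cube with the given side."""
    return side * (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0)


def matching_scale(radius, box=800.0):
    """Real-valued j for which a cube of side box / 2**j has the volume of the sphere."""
    side = radius * (4.0 * math.pi / 3.0) ** (1.0 / 3.0)
    return math.log2(box / side)
```

A cube of side 12.5 $h^{-1}$ Mpc corresponds to a sphere of radius about 7.75 $h^{-1}$ Mpc, and a sphere of radius 8 $h^{-1}$ Mpc corresponds to $j \approx 5.96$, which is why $j = 6$ is compared with $\sigma_8$.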

Figures 7 and 8 give the recovery of off-diagonal $D_{\bf j}$. This recovery works well on
scales above 50 $h^{-1}$ Mpc. On scales less than 50 $h^{-1}$ Mpc, the error is noticeable but still
small. This is partially due to the approximation of the quasi-diagonality of the covariance
$\langle\epsilon_{{\bf j},{\bf l}}\epsilon_{{\bf j},{\bf l}'}\rangle \simeq \delta_{{\bf l},{\bf l}'}\langle\epsilon^2_{{\bf j},{\bf l}}\rangle$ in equation (56). Moreover, the separation between the factors involving
operators A and B in eq.(75) is not perfect either.

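The errors of recovering $D_{\bf j}$ can be more clearly seen via a comparison with the recovery
of the DWT power spectrum from redshift space $P^S_{\bf j}$ to real space $P_{\bf j}$. It is (Yang et al. 2002)

$$P^S_{\bf j} \simeq \Gamma^{2,0}_{\bf j}(1 + 2\beta\Gamma^{0,1}_{\bf j} + \beta^2\Gamma^{0,2}_{\bf j})P_{\bf j}, \tag{76}$$

where $\Gamma^{a,b}_{\bf j}$ is defined by eq.(50) but with replacement of $\phi_{{\bf j},{\bf l}}$ by $\psi_{{\bf j},{\bf l}}$. The factors $\Gamma^{0,1}_{\bf j}$ and $\Gamma^{0,2}_{\bf j}$
also satisfy $\Gamma^{0,1}_{j,j,j} \simeq 1/3$ and $\Gamma^{0,2}_{j,j,j} \simeq 1/5$. The non-linear term $\Gamma^{2,0}_{\bf j}$ behaves the same as $\gamma^{2,0}_{\bf j}$,
but with a little stronger dependence on $j_\perp$.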

We find that the recovery of $D_{\bf j}$ by eq.(75) has an error within 10% in most cases, while
the DWT power spectrum recovery eq.(76) is accurate to 5% or better. This is because
the quasi-diagonality of the SFC's covariance $\langle\epsilon_{{\bf j},{\bf l}}\epsilon_{{\bf j},{\bf l}'}\rangle \simeq \delta_{{\bf l},{\bf l}'}\langle\epsilon^2_{{\bf j},{\bf l}}\rangle$ is poorer than the
quasi-diagonality of the WFC's covariance $\langle\tilde\epsilon_{{\bf j},{\bf l}}\tilde\epsilon_{{\bf j},{\bf l}'}\rangle \simeq \delta_{{\bf l},{\bf l}'}\langle\tilde\epsilon^2_{{\bf j},{\bf l}}\rangle$. As we know from the DWT
analysis, the covariance $\langle\tilde\epsilon_{{\bf j},{\bf l}}\tilde\epsilon_{{\bf j},{\bf l}'}\rangle$ is always quasi-diagonal for a Gaussian field, regardless of the
power spectrum$^1$. On the other hand, the quasi-diagonality of the covariance $\langle\epsilon_{{\bf j},{\bf l}}\epsilon_{{\bf j},{\bf l}'}\rangle$ is not
a generic mathematical property, but only an approximation.

$^1$This is the property employed for data compression of the DWT analysis (Louis, Maass and Rieder 1997).


5.4. Problems with real samples

All the above algorithms can be applied to real samples as well. However, there are
several problems to be considered in the analysis of real data.

1. Data assignment on grid

To calculate the moments of counts-in-cells, one has to cast a grid on the data set either
explicitly or implicitly, and assign the data to cells defined by the grid. The arbitrariness of
the assignment may lead to significant error (Colombi, Bouchet & Schaeffer 1995). One way
to reduce this error is to shift the grid. It is well known that this assignment may also result in
spurious features of the power spectrum on scales around the Nyquist frequency of the grid.
In the DWT analysis, one can use different grids, but no shift is needed. For a given grid, the
assignment is realized by the same scaling function used for the data decomposition. Since
scaling functions are orthogonal and complete in each given scale ${\bf j}$, the spurious features
and false correlations can be completely avoided (Fang & Feng 2000). To test the effect
of the assignment on the one-point statistics, we calculate $D_{j,j,j}$ for a snapshot sample of
LCDM1 with randomly shifted grids. The snapshot sample is taken at $z = 0.11$. In Fig. 9,
Snap-0 represents $D_{j,j,j}$ without grid shifting, while Snap-32 and Snap-1024 are, respectively,
the averages over 32 and 1024 random shifts for grids on all scales. No significant deviation
is detected among the results.
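
The logic of the random-shift test can be illustrated with a 1-D toy version. The sketch below assumes top-hat cells (equivalent to a Haar scaling function rather than the D4 function used in this paper), a periodic field sampled on a regular grid, and NumPy's random generator; all names are hypothetical.

```python
import numpy as np


def cell_second_moment(delta, j, shift=0):
    """Second moment of the cell-averaged field on 2**j cells after a periodic shift."""
    cells = np.roll(delta, shift).reshape(2**j, -1).mean(axis=1)
    return float(np.mean(cells**2))


def averaged_second_moment(delta, j, n_shifts, rng):
    """Average the second moment over n_shifts random grid shifts (0 means no shift)."""
    if n_shifts == 0:
        return cell_second_moment(delta, j)
    shifts = rng.integers(0, delta.size, size=n_shifts)
    return float(np.mean([cell_second_moment(delta, j, int(s)) for s in shifts]))


rng = np.random.default_rng(0)
delta = rng.standard_normal(1024)          # toy periodic density contrast
snap_0 = averaged_second_moment(delta, 3, 0, rng)
snap_32 = averaged_second_moment(delta, 3, 32, rng)
```

Comparing `snap_0` with `snap_32` for a toy field mirrors the comparison of Snap-0 with Snap-32, although the toy field says nothing about the size of the effect in the simulations.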

2. Boundary condition and edge effect

In a real survey, the edge effect is unavoidable due to the geometry (e.g. Szapudi &
Colombi 1996). Since the DWT bases are localized in physical space, the effect of edges can
be effectively suppressed by dropping modes that are close to the boundary of the samples.
In other words, when calculating $D_{\bf j}$, the averaging in eq.(17) runs only over modes that are
not significantly affected by the edge effect, and the normalization factor $L_1L_2L_3$ is replaced
by the volume over which the average takes place. This method has been tested numerically
in the power spectrum detection (Pando & Fang 1998b). It shows that the power spectrum
can be fairly reconstructed regardless of whether a periodic boundary or zero padding is applied
outside of the samples. The treatment of dropping edge modes has also been successfully
employed to obtain the power spectrum for the Las Campanas redshift survey of galaxies,
which has a slice-like geometry (Yang et al. 2001b). When the scale of interest is close
to the dimension of the sample, one cannot afford to drop all the edge modes, and then a
more thorough treatment is needed. The edge effect exists in our simulation sample because
the light-cone output is not periodic along the redshift axis, while we have used periodic
D4 wavelets in the analysis. For comparison, the real-space light-cone $D_{j,j,j}$ from Fig. 5 is
included in Fig. 9. There is no significant difference between the results from the snapshot,
which is truly periodic, and that from the light-cone around the same redshift. The slightly
larger standard deviation of the light-cone result at 200 $h^{-1}$ Mpc (a quarter of the simulation
box) is due to a larger fraction of edge cells.
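
Dropping edge modes amounts to averaging over an interior subset of cells and normalising by the number of cells actually used. A minimal sketch, assuming the SFCs of one scale are stored in a 3-D NumPy array indexed by $(l_1, l_2, l_3)$ and that one layer of cells next to each face is discarded (both are illustrative choices, not prescriptions from the text):

```python
import numpy as np


def interior_second_moment(eps, n_edge=1):
    """Mean of eps**2 over cells at least n_edge cells away from every face."""
    inner = tuple(slice(n_edge, size - n_edge) for size in eps.shape)
    core = eps[inner]
    if core.size == 0:
        raise ValueError("no interior modes left; the scale is too close to the sample size")
    # np.mean divides by the number of retained cells, i.e. the retained volume
    return float(np.mean(core**2))
```

The `ValueError` branch corresponds to the case noted above in which the scale of interest approaches the dimension of the sample and the edge modes cannot all be dropped.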

3. Non-Poisson sampling 

In §3.3, we have considered the correction for Poisson sampling. It is sufficient for
simulation samples. However, it may not be typical for real samples. For instance, some
galaxy catalogs may be given by sub-Poisson sampling on small scales, or in small halos (e.g.
Bullock, Wechsler & Somerville 2002). The sub-Poisson distribution is simply due to the very
low mass of the considered halo, so that it cannot host any additional objects. In other words,
the sub-Poisson distribution is significant on small scales on which galaxies are anticorrelated.
Therefore, this sub-Poisson sampling is similar to the sub-Poisson distribution of the Fermi-Dirac
statistics (one state can host no more than one particle). The effect of this sub-Poisson sampling
has been extensively studied in quantum optics (e.g. Martin & Landauer 1992). It is possible
to extend the Poisson sampling eq.(35) to include Fermi-like sampling. The algorithm for
a modified Poisson sampling has been developed by Jamkhedkar, Bi and Fang (2001) (see
their Appendix B).


6. $\beta$ estimators with the DWT moment of one-point statistics


6.1. $\beta$ estimator with scale-decomposed quantities


We first consider a $\beta$ estimator without non-linear effect. For instance, when $\gamma^{2,0}_{j,j,2} \simeq \gamma^{2,0}_{j,j,3} \simeq 1$
(Fig. 4), i.e. the non-linear effects are small above 100 $h^{-1}$ Mpc along the LOS, eq.(75) gives

$$D^S_{j_1,j_2,j_3} \simeq (1 + 2\beta\gamma^{0,1}_{j_1,j_2,j_3} + \beta^2\gamma^{0,2}_{j_1,j_2,j_3})D_{j_1,j_2,j_3}, \qquad j_3 = 2, 3. \tag{77}$$


It is easy to construct a $\beta$ estimator with eq.(77), because the quantities $D_{\bf j}$, $D^S_{\bf j}$, and $\gamma^{0,b}_{\bf j}$
are not rotationally invariant, but they satisfy the following symmetries with respect to the
triple indices $(j_1, j_2, j_3)$.


1. If the cosmic density and velocity fields are statistically isotropic, $D_{\bf j}$ in real space is
invariant with respect to cyclic permutations of index ${\bf j} = (j_1, j_2, j_3)$, i.e.

$$D_{j_1,j_2,j_3} = D_{j_3,j_1,j_2} = D_{j_2,j_3,j_1}. \tag{78}$$


2. In the plane-parallel approximation, i.e. the coordinate $x_3$ is in the redshift direction,
we have

$$D^S_{j_1,j_2,j_3} = D^S_{j_2,j_1,j_3}, \tag{79}$$

$$\gamma^{a,b}_{j_1,j_2,j_3} = \gamma^{a,b}_{j_2,j_1,j_3}. \tag{80}$$


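Using eqs.(77)-(80), we have a $\beta$ estimator as follows

$$\frac{D^S_{j,3,2}}{D^S_{j,2,3}} = \frac{1 + 2\beta\gamma^{0,1}_{j,3,2} + \beta^2\gamma^{0,2}_{j,3,2}}{1 + 2\beta\gamma^{0,1}_{j,2,3} + \beta^2\gamma^{0,2}_{j,2,3}}. \tag{81}$$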


Eq.(81) looks very similar to the $\beta$ estimator with quadrupole-to-monopole ratio, or the
multipole moments of two-point correlation function. However, eq.(81) contains not only
the information of shape (like multipole moments), but also scales. All the DWT quantities
in eq.(77) are on scale ${\bf j}$. That is, the $\beta$ in eq.(77) refers to mode ${\bf j} = (j_1, j_2, j_3)$. Therefore,
if $\beta$ is scale-dependent, the $\beta$ estimated by eq.(81) is its value referring to ${\bf j}$. On the other
hand, if $\beta$ is scale-free, the values of $\beta$ given by estimator eq.(81) with different mode ${\bf j}$
should be the same.
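
An estimator of the form of eq.(81), i.e. a measured ratio of two redshift-space moments set equal to a ratio of two quadratics in $\beta$, reduces to a quadratic equation for $\beta$. The sketch below solves it in closed form. The function name is hypothetical, taking the smallest non-negative root is an assumption, and the numerical factors in the usage lines are made-up placeholders rather than values of $\gamma^{0,1}_{\bf j}$ and $\gamma^{0,2}_{\bf j}$ from this paper.

```python
import math


def beta_from_ratio(ratio, g1_num, g2_num, g1_den, g2_den):
    """Solve ratio = (1 + 2*b*g1_num + b**2*g2_num) / (1 + 2*b*g1_den + b**2*g2_den) for b."""
    a = g2_num - ratio * g2_den
    b = 2.0 * (g1_num - ratio * g1_den)
    c = 1.0 - ratio
    if abs(a) < 1e-14:                      # degenerate case: linear equation
        return -c / b if b != 0.0 else None
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                         # no real solution for this ratio
    roots = [(-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1.0, -1.0)]
    candidates = [r for r in roots if r >= 0.0]
    return min(candidates) if candidates else None


# Round trip with placeholder factors and beta = 0.5
true_ratio = (1 + 2 * 0.5 * 0.5 + 0.25 * 0.35) / (1 + 2 * 0.5 * 0.2 + 0.25 * 0.1)
beta_est = beta_from_ratio(true_ratio, 0.5, 0.35, 0.2, 0.1)
```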


Figure 10 plots the results in the case of a scale-free $\beta$ for the LCDM1 model using
estimators with $j = 2, 3, 4$. The results indeed show that the estimated $\beta$ is independent
of the $j$ used in the estimator, and it follows the theoretical curves in the entire redshift
range considered. Therefore, we expect that the estimator (81) would be useful to study the
scale-dependence of bias parameters of galaxies.

For comparison, Fig. 11 shows the $\beta$ estimated from the quadrupole-to-monopole ratio as

$$\frac{P^S_2(k)}{P^S_0(k)} = \frac{\frac{4}{3}\beta + \frac{4}{7}\beta^2}{1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2}, \tag{82}$$

where $P^S_2(k)$ and $P^S_0(k)$ are the quadrupole moment and the monopole moment of the power
spectrum $P^S({\bf k})$ respectively (Cole, Fisher & Weinberg 1994, hereafter CFW; Hamilton
1997). We average the results from wavelength 50 $h^{-1}$ Mpc to 400 $h^{-1}$ Mpc for each realization.
Thus the error bars reflect the scatter among different realizations of each model. As noticed in CFW, this
estimator gives a lower $\beta$ than the true value at $z \simeq 0$ even for wavelengths > 50 $h^{-1}$ Mpc.
In addition, we find that it progressively overestimates $\beta$ at higher and higher redshifts. It
is suggested in CFW that the underestimate at short wavelengths is due to the impact of
non-linear gravitational clustering. However, this explanation is difficult to apply to
the underestimate at long wavelengths and the overestimate at high redshift, as the cosmic field
on long wavelengths or at high redshifts should be linear.

This discrepancy occurs because the quadrupole-to-monopole ratio eq.(82) is scale-dependent
even when the field is in the linear regime. This scale-dependence is given by the
non-linear redshift distortion, or the distortion of velocity dispersion. As mentioned in §5.3,
since $\sigma^v_{\bf j}$ is non-zero in the linear clustering regime, the non-linear redshift distortion (or the distortion
of $\sigma^v_{\bf j}$) should be considered regardless of whether the field is linear or non-linear. Therefore,
the discrepancy in CFW cannot be eliminated for long wavelengths or high redshift. To solve
this problem, extra information, such as the power-spectrum index of the density fluctuations,
or the real-space correlation function, is added in the quadrupole-to-monopole ratio
estimator eq.(82) (Peacock et al. 2001).
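
For reference, the linear-theory relation of eq.(82) can be inverted for $\beta$ as a quadratic equation. The sketch below assumes NumPy and uses illustrative function names. It implements only the linear relation, so it carries exactly the scale-dependent bias discussed above when velocity dispersion is not negligible.

```python
import numpy as np


def q2m_ratio(beta):
    """Linear-theory quadrupole-to-monopole ratio of eq. (82)."""
    return (4.0 / 3.0 * beta + 4.0 / 7.0 * beta**2) / (1.0 + 2.0 / 3.0 * beta + 0.2 * beta**2)


def beta_from_q2m(ratio):
    """Smallest non-negative real beta with q2m_ratio(beta) == ratio, or None."""
    # ratio * (1 + 2b/3 + b^2/5) = 4b/3 + 4b^2/7, rearranged as a quadratic in b
    coeffs = [4.0 / 7.0 - ratio / 5.0, 4.0 / 3.0 - 2.0 * ratio / 3.0, -ratio]
    roots = np.roots(coeffs)
    real = [float(r.real) for r in roots if abs(r.imag) < 1e-12 and r.real >= 0.0]
    return min(real) if real else None
```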


6.2. $\beta$ estimator with non-linear redshift distortion

Unlike the quadrupole-to-monopole ratio, the DWT $\beta$ estimators are scale-decomposed,
which makes them useful for studying scale-dependent effects. The DWT estimators are able
to account for the non-linear redshift distortion (the $\sigma^v_{\bf j}$ redshift distortion) without assuming extra
information. Actually, the result with estimator eq.(81) has already shown the effect of the
non-linear redshift distortion. We can see from Fig. 10 that the $\beta$ estimated by eq.(81) is
slightly dependent on $j$, and it is progressively higher in the order $j = 2, 3, 4$. This can be
explained with Fig. 4, which shows that the factor $\gamma^{2,0}_{j,j,j_3}$ is less than one, and is smaller for
a smaller scale. Thus the estimator eq.(81), which ignores the factor $\gamma^{2,0}_{\bf j}$ in eq.(77), leads
to a progressively higher $\beta$ in the order $j = 2, 3, 4$. The simplest way to estimate the
non-linear effect is therefore to take an average over the $\beta$ given by eq.(81) with different $j$. The left panel
of Figure 12 presents the averaged $\beta$ from Fig. 10 for the four models. The error bars contain
the contribution of the non-linear effect. That is, the DWT algorithm can do a self-test on
whether the result is largely affected by the non-linear effect.


More delicate $\beta$ estimators can be constructed if we consider the following properties:
the non-linear factor $\gamma^{2,0}_{\bf j}$ depends only on $j_3$ but is independent of $j_\perp$ if the LOS scale is above
100 $h^{-1}$ Mpc, while the linear factors $\gamma^{0,1}_{\bf j}$ and $\gamma^{0,2}_{\bf j}$ depend only on the shape of the DWT
mode (§5.2). These properties apply to $\Gamma^{2,0}_{\bf j}$, $\Gamma^{0,1}_{\bf j}$, and $\Gamma^{0,2}_{\bf j}$ as well. Thus, we can combine
modes with similar but different shapes to cancel the non-linear factors, such as

$$\frac{\Gamma^{2,0}_{j,2,3}\,\Gamma^{2,0}_{j',3,2}}{\Gamma^{2,0}_{j',2,3}\,\Gamma^{2,0}_{j,3,2}} \simeq 1, \qquad j \neq j'. \tag{83}$$


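Thus, from eq.(76), we have a $\beta$ estimator as

$$\frac{P^S_{j,2,3}\,P^S_{j',3,2}}{P^S_{j',2,3}\,P^S_{j,3,2}} = \frac{(1 + 2\beta\Gamma^{0,1}_{j,2,3} + \beta^2\Gamma^{0,2}_{j,2,3})(1 + 2\beta\Gamma^{0,1}_{j',3,2} + \beta^2\Gamma^{0,2}_{j',3,2})}{(1 + 2\beta\Gamma^{0,1}_{j',2,3} + \beta^2\Gamma^{0,2}_{j',2,3})(1 + 2\beta\Gamma^{0,1}_{j,3,2} + \beta^2\Gamma^{0,2}_{j,3,2})}. \tag{84}$$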


This estimator is similar to that in Yang et al. (2002). It can be used for any pairs ($j \neq
j'$) even when the scale of $j$ is small. The right panel of Figure 12 shows that for the four
models, the $\beta$ estimated by eq.(84) has no more than about 15% error at all redshifts $z < 3$.
One can also use the modes $(j, 3, 4)$, $(j, 4, 3)$ to replace modes $(j, 2, 3)$ and $(j, 3, 2)$ in eqs.(83)
and (84), which gives similar results to Fig. 12. That is, the weakly non-linear redshift
distortion can be accounted for by the DWT $\beta$ estimators on scales down to about 50 $h^{-1}$ Mpc,
where the quadrupole-to-monopole estimator shows significant errors (Fig. 11).
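
An estimator built from a product of four modes, as in eq.(84), gives a ratio of quartic polynomials in $\beta$, so a numerical root search is convenient. The bisection sketch below is one possible approach, not the procedure used for Fig. 12. The dictionary keys, the search bracket $0 \le \beta \le 2$, and the numerical $\Gamma$ values in the usage lines are all hypothetical.

```python
def linear_factor(beta, g1, g2):
    """1 + 2*beta*Gamma^{0,1} + beta**2*Gamma^{0,2} for one mode."""
    return 1.0 + 2.0 * beta * g1 + beta**2 * g2


def eq84_rhs(beta, G):
    """G maps 'j23', 'j32', 'k23', 'k32' (k stands for j') to (Gamma01, Gamma02) pairs."""
    num = linear_factor(beta, *G["j23"]) * linear_factor(beta, *G["k32"])
    den = linear_factor(beta, *G["k23"]) * linear_factor(beta, *G["j32"])
    return num / den


def solve_beta(measured, G, lo=0.0, hi=2.0, tol=1e-10):
    """Bisection for eq84_rhs(beta) = measured; None if the bracket has no sign change."""
    f_lo = eq84_rhs(lo, G) - measured
    f_hi = eq84_rhs(hi, G) - measured
    if f_lo == 0.0:
        return lo
    if f_lo * f_hi > 0.0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = eq84_rhs(mid, G) - measured
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Placeholder factors, chosen only to exercise the solver
G_demo = {"j23": (0.20, 0.08), "j32": (0.45, 0.30), "k23": (0.25, 0.11), "k32": (0.40, 0.25)}
beta_demo = solve_beta(eq84_rhs(0.5, G_demo), G_demo)
```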

The counterpart of eq.(84) with $D_{\bf j}$,

$$\frac{D^S_{j,2,3}\,D^S_{j',3,2}}{D^S_{j',2,3}\,D^S_{j,3,2}} = \frac{(1 + 2\beta\gamma^{0,1}_{j,2,3} + \beta^2\gamma^{0,2}_{j,2,3})(1 + 2\beta\gamma^{0,1}_{j',3,2} + \beta^2\gamma^{0,2}_{j',3,2})}{(1 + 2\beta\gamma^{0,1}_{j',2,3} + \beta^2\gamma^{0,2}_{j',2,3})(1 + 2\beta\gamma^{0,1}_{j,3,2} + \beta^2\gamma^{0,2}_{j,3,2})}, \tag{85}$$

is not as good as eq.(84). It causes an error of about 40% in $\beta$ because again the quasi-diagonality
of the SFC's covariance is poorer than that of the WFC's. Since a DWT decomposition
of a random field yields variables for calculating both the power spectrum and one-point
statistics, the $\beta$ estimations with eqs.(81) and (84) can be done at the same time.

An accurate $\beta$ estimator requires a precise recovery of $D_{\bf j}$ or $P_{\bf j}$, which
accounts for the non-linear redshift distortion due to the velocity dispersion, and even the
second order effect of the bulk velocity ignored in eq.(48). In other words, the non-linear
redshift distortion exists even on scales that are linear in the sense of structure formation,
and it must be corrected to achieve a reliable and accurate $\beta$.


7. Discussion and Conclusion 

We have developed the one-point statistics of a perturbed density field with the multiresolutional
decomposition based on the discrete wavelet transform. Since the scale and shape
of the DWT bases are well defined, this framework is very effective in dealing with problems of
how the one-point distribution and its moments depend on the scale and shape of the window
function. With this property, we have established the algorithm of the one-point variable and
its moments, considering the effects of Poisson sampling and selection function. We have
also established the algorithm for recovering the DWT one-point statistics from the redshift
distortion due to bulk velocity, velocity dispersion and selection function.

Because the recovery of the real-space DWT one-point variable and its moments can
be realized scale-by-scale, one can design $\beta$ estimators which are sensitive to the scale-dependence
of $\beta$ caused, for instance, by the scale-dependence of the bias parameter of galaxies.
These $\beta$ estimators are effective in avoiding the difficulty caused by the scale-dependence of
the non-linear redshift distortion. Compared with conventional $\beta$ estimators (Peacock et al.
2001), the DWT $\beta$ estimators do not need to assume that the velocity dispersion is scale-independent,
or to add extra information, such as the power-spectrum index or the real-space
correlation function of the field. Numerical tests with N-body simulation samples show that
the proposed estimators can yield the correct value of $\beta$ with about 15% uncertainty for all
popular CDM models in the redshift range $z < 3$.

Since DWT decomposition contains two sets of bases, the scaling function $\phi_{{\bf j},{\bf l}}({\bf x})$ and the
wavelet $\psi_{{\bf j},{\bf l}}({\bf x})$, a DWT decomposition of a density field $\delta({\bf x})$ actually yields variables for
one-point statistics $\epsilon_{{\bf j},{\bf l}} = \int\phi_{{\bf j},{\bf l}}({\bf x})\delta({\bf x})d{\bf x}$ as well as the variables for calculating the power
spectrum $\tilde\epsilon_{{\bf j},{\bf l}} = \int\psi_{{\bf j},{\bf l}}({\bf x})\delta({\bf x})d{\bf x}$ (Fang & Feng 2000). In this sense, one can say that the DWT
decomposition unifies the algorithm of CiC and power spectrum, which are only two aspects
of the statistics with the DWT variables SFCs and WFCs respectively. For a finite size sample,
these two aspects of statistics can play different roles: the former contains information
on the perturbations on scales larger than the size, while the latter does not. However, the
latter generally has diagonalized covariance, while the former does not. Therefore, the $\beta$
estimators with WFCs are better than those with SFCs, while SFCs are useful to estimate the effect of
perturbations on large scales.
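
The way one decomposition yields both sets of variables can be illustrated in 1-D. The sketch below assumes the Haar scaling function and wavelet instead of the D4 pair used in this paper, a periodic field on a regular grid in a box of length $L = 1$, and NumPy; the function name is hypothetical.

```python
import numpy as np


def haar_sfc_wfc(delta, j):
    """SFCs and WFCs of a gridded 1-D field at scale j for the Haar basis (L = 1)."""
    n_cells = 2**j
    width = delta.size // n_cells            # grid points per cell (must be even)
    dx = 1.0 / delta.size                    # grid spacing
    blocks = delta.reshape(n_cells, width)
    half = width // 2
    norm = np.sqrt(n_cells)                  # amplitude (2**j / L)**0.5 of the basis
    sfc = norm * blocks.sum(axis=1) * dx
    wfc = norm * (blocks[:, :half].sum(axis=1) - blocks[:, half:].sum(axis=1)) * dx
    return sfc, wfc


rng = np.random.default_rng(1)
field = rng.standard_normal(64)
sfc2, wfc2 = haar_sfc_wfc(field, 2)
sfc3, _ = haar_sfc_wfc(field, 3)
```

In this toy case the SFCs at scale $j + 1$ carry exactly the power of the SFCs plus the WFCs at scale $j$, i.e. `sum(sfc2**2) + sum(wfc2**2)` equals `sum(sfc3**2)` up to rounding.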
Actually, the DWT decomposition can provide more types of statistical variables. For
2-D or 3-D samples, we have variables defined as

$$\int\delta({\bf x})\phi_{j_1,l_1}(x_1)\psi_{j_2,l_2}(x_2)\psi_{j_3,l_3}(x_3)\,d{\bf x}, \tag{86}$$

$$\int\delta({\bf x})\phi_{j_1,l_1}(x_1)\phi_{j_2,l_2}(x_2)\psi_{j_3,l_3}(x_3)\,d{\bf x}. \tag{87}$$

Obviously, these variables are sampled partially by the scaling function $\phi_{j,l}(x)$, and partially
by the wavelet $\psi_{j,l}(x)$. Statistics with these variables are not typical CiC, or power spectrum.
They are, however, useful to study the one-point statistics, or power spectrum of 2-D and
3-D samples. With the method developed in this paper, it is not difficult to calculate
various corrections (Poisson sampling, selection function, redshift distortion) on the one-point
statistics with variables (86) or (87).


HZ is grateful to Daniel Eisenstein for extensive discussions on this paper and facilitating 
the LCDM2 simulation. HZ would also like to thank David Burstein for hosting the utility 
codes in this paper. 


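A. Calculations of $\gamma^{a,b}_{{\bf j};{\bf l},{\bf l}'}$ and $Q_{{\bf j};{\bf l},{\bf l}'}$

A.1. $\gamma^{0,b}_{\bf j}$

Let us consider the plane-parallel approximation, i.e. coordinate $x_3$ is in the redshift
direction. By definition eq.(50), we have

$$\gamma^{0,b}_{\bf j} = \int\phi_{{\bf j},{\bf l}}({\bf x})(\partial_3^2\nabla^{-2})^b\phi_{{\bf j},{\bf l}}({\bf x})\,d{\bf x}. \tag{A1}$$

Because the 1-D scaling function $\phi_{j,l}(x)$ is given by dilating and translating the basic scaling function
$\phi(\eta)$ as

$$\phi_{j,l}(x) = \left(\frac{2^j}{L}\right)^{1/2}\phi\left(\frac{2^j x}{L} - l\right), \tag{A2}$$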

the Fourier transform of ipj^iix) is 




(A3) 


and 


hAn) = { , 


(A4) 
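where $\hat\phi(n)$ is the Fourier transform of the basic scaling function

$$\hat\phi(n) = \int_{-\infty}^{\infty}\phi(\eta)e^{-i2\pi n\eta}\,d\eta. \tag{A5}$$

The function $|\hat\phi_{j,l}(n)|^2$ is shown in Fig. 1.

Thus, equation (A1) becomes

$$\gamma^{0,b}_{\bf j} = \frac{1}{2^{j_1+j_2+j_3}}\sum_{n_1,n_2,n_3=-\infty}^{\infty}\left(\frac{k_3}{k}\right)^{2b}|\hat\phi(u_1)\hat\phi(u_2)\hat\phi(u_3)|^2, \tag{A6}$$

where vector ${\bf k} = 2\pi(n_1/L_1, n_2/L_2, n_3/L_3)$, and ${\bf u} = (n_1/2^{j_1}, n_2/2^{j_2}, n_3/2^{j_3})$. Since $\hat\phi(n)$ is
non-zero only around $|n| < 1$, the summation of equation (A6) actually runs only over numbers
of $|n_i| \lesssim 2^{j_i}$.

If $L_1 = L_2 = L_3 = L$, equation (A6) becomes

$$\gamma^{0,b}_{j_1,j_2,j_3} = \frac{1}{2^{j_1+j_2+j_3}}\sum_{n_1,n_2,n_3=-\infty}^{\infty}\left(\frac{n_3}{n}\right)^{2b}|\hat\phi(u_1)\hat\phi(u_2)\hat\phi(u_3)|^2, \tag{A7}$$

with $n^2 = n_1^2 + n_2^2 + n_3^2$.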
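
A mode sum of the type of equation (A7), i.e. $(n_3/n)^{2b}$ weighted by the squared Fourier transform of the scaling function, can be evaluated directly. The sketch below is only an illustration under several assumptions: it uses the Haar scaling function, for which $|\hat\phi(u)|^2 = {\rm sinc}^2(u)$, instead of D4; it truncates the sum at $|u_i| \le u_{\max}$ and normalises by the truncated weight; and it assigns zero to the $n = 0$ term. Its numbers therefore differ from those plotted in Fig. 2.

```python
import numpy as np


def gamma_0b(j, b, u_max=8):
    """Truncated, Haar-weighted mode sum of (n3/n)**(2b) for mode j = (j1, j2, j3)."""
    axes, weights = [], []
    for ji in j:
        n = np.arange(-u_max * 2**ji, u_max * 2**ji + 1)
        axes.append(n.astype(float))
        weights.append(np.sinc(n / 2**ji) ** 2)   # |phi_hat(u)|^2 for Haar
    n1, n2, n3 = np.meshgrid(*axes, indexing="ij", sparse=True)
    w = weights[0][:, None, None] * weights[1][None, :, None] * weights[2][None, None, :]
    n_sq = n1**2 + n2**2 + n3**2
    # (n3/n)^2, set to zero for the n = 0 mode
    mu2 = np.divide(n3**2, n_sq, out=np.zeros_like(n_sq), where=n_sq > 0)
    return float(np.sum(mu2**b * w) / np.sum(w))
```

With these assumptions a cubic cell such as `gamma_0b((2, 2, 2), 1)` comes out close to 1/3, and a cell that is longer along the LOS, such as `(3, 3, 2)`, gives a smaller value.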


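A.2. $\gamma^{2,0}_{\bf j}$

If $\sigma^v({\bf x})$ is independent of ${\bf x}$, $\gamma^{2,0}_{\bf j}$ is

$$\gamma^{2,0}_{\bf j} = \frac{1}{2^{j_1+j_2+j_3}}\sum_{n_1,n_2,n_3=-\infty}^{\infty}|\hat\phi(u_1)\hat\phi(u_2)\hat\phi(u_3)|^2\exp[-(\sigma^v/H)^2(\hat{\bf r}\cdot{\bf k})^2]. \tag{A8}$$

In the plane-parallel approximation, we have

$$\gamma^{2,0}_{j_1,j_2,j_3} = \frac{1}{2^{j_3}}\sum_{n_3=-\infty}^{\infty}|\hat\phi(u_3)|^2\exp[-(\sigma^v k_3/H)^2]. \tag{A9}$$

The summation of equations (A8) and (A9) also runs only over numbers of $|n_i| \lesssim 2^{j_i}$.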
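
The damping sum along the LOS can be illustrated in the same spirit. The sketch below evaluates a sum of squared scaling-function transforms weighted by $\exp[-(\sigma^v k_3/H)^2]$ with $k_3 = 2\pi n_3/L$. It assumes the Haar scaling function rather than D4, $H = 100\,h$ km s$^{-1}$ Mpc$^{-1}$, $L = 800\,h^{-1}$ Mpc, and a truncated sum normalised by the undamped weight. The values it returns are therefore not the D4 values quoted in §5.2, although the trend with LOS scale is the same.

```python
import numpy as np


def gamma_20(j3, sigma_v, box=800.0, hubble=100.0, u_max=64):
    """Haar-weighted LOS damping factor for LOS scale index j3; sigma_v in km/s."""
    n = np.arange(-u_max * 2**j3, u_max * 2**j3 + 1)
    weight = np.sinc(n / 2**j3) ** 2          # |phi_hat(u3)|^2 for Haar
    k3 = 2.0 * np.pi * n / box                # h Mpc^-1
    damping = np.exp(-(sigma_v * k3 / hubble) ** 2)
    return float(np.sum(weight * damping) / np.sum(weight))
```

For $\sigma^v = 500$ km s$^{-1}$ this toy version stays within several percent of unity at $j_3 = 3$ (LOS scale 100 $h^{-1}$ Mpc) and drops well below unity at $j_3 = 6$ (12.5 $h^{-1}$ Mpc).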


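A.3. $\gamma^{a,b}_{{\bf j};{\bf l},{\bf l}'}$

Using the results for $\gamma^{0,b}_{\bf j}$ and $\gamma^{2,0}_{\bf j}$, it is easy to find

$$\gamma^{a,b}_{{\bf j};{\bf l},{\bf l}'} = \frac{1}{2^{j_1+j_2+j_3}}\sum_{n_1,n_2,n_3=-\infty}^{\infty}\left(\frac{n_3}{n}\right)^{2b}\cos[2\pi{\bf u}\cdot({\bf l}' - {\bf l})]\exp[-(a/2)(\sigma^v k_3/H)^2]\,|\hat\phi(u_1)\hat\phi(u_2)\hat\phi(u_3)|^2. \tag{A10}$$

It is obvious from equation (A10) that $\gamma^{a,b}_{{\bf j};{\bf l},{\bf l}'}$ is symmetric with respect to ${\bf l}$ and ${\bf l}'$, and it
depends only on the difference $(|l_1 - l_1'|, |l_2 - l_2'|, |l_3 - l_3'|)$. Consequently, $\gamma^{a,b}_{{\bf j};{\bf l},{\bf l}}$ is independent
of ${\bf l}$. In addition, the elements $\gamma^{a,b}_{{\bf j};{\bf l},{\bf l}}$ are dominant over $\gamma^{a,b}_{{\bf j};{\bf l},{\bf l}'}$ with ${\bf l} \neq {\bf l}'$, because the latter
sums over oscillating terms.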


A.4. $Q_{{\bf j};{\bf l},{\bf l}'}$


The quantity $Q_{{\bf j};{\bf l},{\bf l}'}$ is given by eq.(64)



For index ${\bf l}$, ${\bf l}'$, it depends only on the difference $l_3 - l_3'$. We have

1 °° k 

QiXr = Qi,i3-i'3 = 2h+j2+j3 ^ sin[27ra3(/3 - /g)] |0(ui)0(m2)0(m3) T- 

ni ,n2,n3=oo 

Equation (A12) gives 

Qi,h-i3 = 0 ) if h — = 0 . 


(All) 


(A 12 ) 


(A13) 


Since 


2J3-1 


sin[27™3(/3 - I's)] sin[27ra'(/3 - /g)] ^ | othLwhe 

l3-l'-,=0 ^ 


we have 


E 


- -3 — \k^ 

Zs-ig ni,n2,n3=-oo 


i.^3 I 3 22(ii+j2+i3) 2 ttu 

Thus, equations(A7) and (A15) yield 


-I -1 / 7 

(il) i<^(“i)<^(“2)0(M3)r 


13 -V 3 


13 - 1 ' 


< 


{27rf/^L^ 


1 2 


2 h 




0,112 


(A14) 


(A15) 


(A16) 
Fig. 1. The scaling function $\phi_{2,0}(x)$, the wavelet $\psi_{2,0}(x)$, and their Fourier transforms
$|\hat\phi_{2,0}(n)|^2$, $|\hat\phi_{3,0}(n)|^2$, and $|\hat\psi_{2,0}(n)|^2$, where $n = kL/2\pi$, and $L = 1$.



Fig. 2. The factors of linear redshift distortion $\gamma^{0,1}_{\bf j}$ and $\gamma^{0,2}_{\bf j}$ in equation (73). The subscript
$j_3$ indicates the $x_3$-direction, or the redshift direction. This convention is followed in all the
figures. These factors depend only on the geometry of the DWT cells.










Fig. 3. The real-space velocity dispersion as defined in equation (42). Only the $j$-diagonal
modes, i.e. the modes in cubic cells, are plotted. The velocity is in physical units. The scale
in 1-D is $R(j) = 2^{-j}L$, where $L = 800\,h^{-1}$ Mpc for all models. This definition is assumed in
all the figures. The result of the LCDM2 model is always plotted without error bars here and
below, since there is only one realization for this model.
Fig. 5. The $j$-diagonal DWT 2nd moment $D_{\bf j}$ recovered using equation (75), plotted against
the scale $R(j)$ in $h^{-1}$ Mpc. The thick
lines are at $z = 0.11$ for the LCDM1 and the OCDM models, and 0.08 for the SCDM and
the TCDM models. The thin (lower) line in the upper left panel is the LCDM2 model at
$z = 0.10$ and multiplied by 0.5. The other thin lines are at $z = 1.13$ for the OCDM model,
and 0.71 for the SCDM and the TCDM models. The redshift quoted here is the redshift
at the center of each light-cone output. For clarity, the real-space $D_{j,j,j}$ is plotted without
symbols or error bars, and the redshift distorted $D^S_{j,j,j}$ is plotted without symbols. These
treatments also apply to Figures 7 and 8. The SCDM and TCDM models have significantly
weaker initial power at small wavelengths, so the Poisson noise in the simulations is dominant
on small scales at high redshifts. For this reason, the 2nd moment $D_{6,6,6}$ is not shown for
the SCDM and the TCDM models at $z = 0.71$.
Fig. 6. The recovery of the rms density fluctuation $\sigma_R$ in a sphere of radius $R$, via
$\sigma^S_R \simeq [\gamma^{2,0}_{\bf j}(1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2)]^{1/2}\sigma_R$. The models are shifted for easy reading. The recovery
is performed where the volume of the DWT cell conveniently matches that of a sphere of
radius $R$.

Same as Figure 5, except that this is for off-diagonal modes $D_{j,2,3}$.

















Fig. 9. Snap-$n$, $n$ = 0, 32, 1024, is calculated from the snapshot output with $n$ random
shifts in grid. Snap-0, Snap-32, and the light-cone data are displaced horizontally with
respect to Snap-1024 data for easy identification.



Fig. 10. The $\beta$ parameter estimated by estimator eq.(81) for the LCDM1 model.

Fig. 11. The $\beta$ parameter estimated by quadrupole-to-monopole ratio. The TCDM model
is shifted to the left for clear identification. The lines are theoretical $\beta$ from equation (69)
for each model.

Fig. 12. The left panel is the $\beta$ parameter given by an average over that estimated by
eq.(81) with modes $j = 2, 3, 4$ for the four models, and the right panel is the $\beta$ parameter
estimated by eq.(84). The TCDM model is shifted to the left for clear identification. The
lines are theoretical $\beta$ from equation (69) for each model.








